GENES OF CAROTENOID BIOSYNTHESIS AND METABOLISM
AND METHODS OF USE THEREOF 



BACKGROUND OF THE INVENTION



Field of the Invention



The present invention describes nucleic acid sequences for eukaryotic genes encoding
lycopene ε-cyclase (also known as ε-cyclase and ε lycopene cyclase), isopentenyl
pyrophosphate (IPP) isomerase and β-carotene hydroxylase, as well as vectors containing the
same and hosts transformed with said vectors. The present invention also provides methods 
for augmenting the accumulation of carotenoids, changing the composition of the 
carotenoids, and producing novel and rare carotenoids. The present invention provides 
methods for controlling the ratio or relative amounts of various carotenoids in a host. The 
invention also relates to modified lycopene ε-cyclase, IPP isomerase and β-carotene
hydroxylase. Additionally, the present invention provides a method for screening for genes 
and cDNAs encoding enzymes of carotenoid biosynthesis and metabolism. 



Carotenoid pigments with cyclic endgroups are essential components of the 
photosynthetic apparatus in oxygenic photosynthetic organisms (e.g., cyanobacteria, algae 
and plants; Goodwin, 1980). The symmetrical bicyclic yellow carotenoid pigment
β-carotene (or, in rare cases, the asymmetrical bicyclic α-carotene) is intimately associated with
the photosynthetic reaction centers and plays a vital role in protecting against potentially
lethal photooxidative damage (Koyama, 1991). β-Carotene and other carotenoids derived
from it or from α-carotene also serve as light-harvesting pigments (Siefermann-Harms,
1987), are involved in the thermal dissipation of excess light energy captured by the
light-harvesting antenna (Demmig-Adams & Adams, 1992), provide substrate for the biosynthesis
of the plant growth regulator abscisic acid (Rock & Zeevaart, 1991; Parry & Horgan, 1991), 
and are precursors of vitamin A in human and animal diets (Krinsky, 1987). Plants also 
exploit carotenoids as coloring agents in flowers and fruits to attract pollinators and agents of 
seed dispersal (Goodwin, 1980). The color provided by carotenoids is also of agronomic 
value in a number of important crops. Carotenoids are currently harvested from a variety of 
organisms, including plants, algae, yeasts, cyanobacteria and bacteria, for use as pigments in 
food and feed. 



Background of the Invention 
The probable pathway for formation of cyclic carotenoids in plants, algae and
cyanobacteria is illustrated in Figure 1. Two types of cyclic endgroups or rings are
commonly found in higher plant carotenoids; these are referred to as the β (beta) and ε
(epsilon) rings (Fig. 3). The precursor acyclic endgroup (no ring structure) is referred to as
the ψ (psi) endgroup. The β and ε endgroups differ only in the position of the double bond in
the ring. Carotenoids with two β rings are ubiquitous, and those with one β and one ε ring
are common, but carotenoids with two ε rings are uncommon. β-Carotene (Fig. 1) has two
β-endgroups and is a symmetrical compound that is the precursor of a number of other
important plant carotenoids such as zeaxanthin and violaxanthin (Fig. 2).

Genes encoding enzymes of carotenoid biosynthesis have previously been isolated
from a variety of sources including bacteria (Armstrong et al., 1989, Mol. Gen. Genet. 216,
254-268; Misawa et al., 1990, J. Bacteriol. 172, 6704-12), fungi (Schmidhauser et al., 1990,
Mol. Cell. Biol. 10, 5064-70), cyanobacteria (Chamovitz et al., 1990, Z. Naturforsch. 45c,
482-86; Cunningham et al., 1994) and higher plants (Bartley et al., Proc. Natl. Acad. Sci. USA
88, 6532-36; Martinez-Ferez & Vioque, 1992, Plant Mol. Biol. 18, 981-83). Many of the
isolated enzymes show a great diversity in structure, function and inhibitory properties
between sources. For example, phytoene desaturases from the cyanobacterium
Synechococcus and from higher plants and green algae carry out a two-step desaturation to
yield ζ-carotene as a reaction product. In plants and cyanobacteria a second enzyme
(ζ-carotene desaturase), similar in amino acid sequence to the phytoene desaturase, catalyzes
two additional desaturations to yield lycopene. In contrast, a single desaturase enzyme from
Erwinia herbicola and from other bacteria introduces all four double bonds required to form
lycopene. The Erwinia and other bacterial desaturases bear little amino acid sequence
similarity to the plant and cyanobacterial desaturase enzymes, and are thought to be of
unrelated ancestry. Therefore, even with a gene in hand from one source, it may be difficult to
identify a gene encoding an enzyme of similar function in another organism. In particular,
the sequence similarity between certain of the prokaryotic and eukaryotic genes encoding
enzymes of carotenoid biosynthesis is quite low.

Further, the mechanism of gene expression in prokaryotes and eukaryotes appears to
differ sufficiently such that one cannot expect that an isolated eukaryotic gene will be
properly expressed in a prokaryotic host.

The difficulty in isolating genes encoding enzymes with similar functions is
exemplified by recent efforts to isolate the gene encoding the enzyme that catalyzes the
formation of β-carotene from the acyclic precursor lycopene. Although a gene encoding an
enzyme with this function had been isolated from a bacterium, it had not been isolated from
any photosynthetic prokaryote or from any eukaryotic organism. The isolation and
characterization of the enzyme catalyzing formation of β-carotene in the cyanobacterium
Synechococcus PCC7942 was described by the present inventors and others (Cunningham et
al., 1993 and 1994). The amino acid sequence similarity of the cyanobacterial enzyme to the
various bacterial lycopene β-cyclases is so low (ca. 18-25% overall; Cunningham et al.,
1994) that there is much uncertainty as to whether they share a common ancestry or, instead, 
represent an example of convergent evolution. 

The need remains for the isolation of eukaryotic and prokaryotic genes and cDNAs 
encoding polypeptides involved in the carotenoid biosynthetic pathway, including those 
encoding a lycopene ε-cyclase, IPP isomerase and β-carotene hydroxylase. There remains a
need for methods to enhance the production of carotenoids, to alter the composition of 
carotenoids, and to reduce or eliminate carotenoid production. There also remains a need in 
the art for methods for screening for genes and cDNAs encoding enzymes of carotenoid 
biosynthesis and metabolism. 

SUMMARY OF THE INVENTION

Accordingly, a first object of this invention is to provide purified and/or isolated 
nucleic acids which encode enzymes involved in carotenoid biosynthesis; in particular, 
lycopene ε-cyclase, IPP isomerase and β-carotene hydroxylase.

A second object of this invention is to provide purified and/or isolated nucleic acids 
which encode enzymes which produce novel or uncommon carotenoids. 

A third object of the present invention is to provide vectors containing said genes. 

A fourth object of the present invention is to provide hosts transformed with said 
vectors. 

Another object of the present invention is to provide hosts which accumulate novel or 
uncommon carotenoids or which accumulate greater amounts of specific or total carotenoids. 

Another object of the present invention is to provide hosts with inhibited and/or 
altered carotenoid production. 

Another object of this invention is to secure the expression of eukaryotic carotenoid-
related genes in a recombinant prokaryotic host. 

Yet another object of the present invention is to provide a method for screening for 
eukaryotic and prokaryotic genes and cDNAs which encode enzymes involved in carotenoid 
biosynthesis and metabolism. 

An additional object of the invention is to provide a method for manipulating 
carotenoid biosynthesis in photosynthetic organisms by inhibiting the synthesis of certain 
enzymatic products to cause accumulation of precursor compounds. 

Another object of the invention is to provide modified lycopene ε-cyclase, IPP
isomerase and β-carotene hydroxylase.

These and other objects of the present invention have been realized by the present 
inventors as described below. 

A subject of the present invention is an isolated and/or purified nucleic acid sequence 
which encodes for a protein having lycopene ε-cyclase, IPP isomerase or β-carotene
hydroxylase enzyme activity and having the amino acid sequence of SEQ ID NOS: 2, 4,
14-21 or 23-27.

The invention also includes vectors which comprise any of the nucleic acid sequences 
listed above, and host cells transformed with such vectors. 

Another subject of the present invention is a method of producing or enhancing the 
production of a carotenoid in a host cell, comprising inserting into the host cell a vector 
comprising a heterologous nucleic acid sequence which encodes for a protein having 
lycopene ε-cyclase, IPP isomerase or β-carotene hydroxylase enzyme activity, wherein the
heterologous nucleic acid sequence is operably linked to a promoter; and expressing the 
heterologous nucleic acid sequence to produce the protein. 

Yet another subject of the present invention is a method of modifying the production 
of carotenoids in a host cell, the method comprising inserting into the host cell a vector 
comprising a heterologous nucleic acid sequence which produces an RNA and/or encodes for 
a protein which modifies lycopene ε-cyclase, IPP isomerase or β-carotene hydroxylase
enzyme activity, relative to an untransformed host cell, wherein the heterologous nucleic acid 
sequence is operably linked to a promoter; and expressing the heterologous nucleic acid 
sequence in the host cell to modify the production of the carotenoids in the host cell, relative 
to the untransformed host cell. 

The present invention also includes a method of expressing, in a host cell, a
heterologous nucleic acid sequence which encodes for a protein having lycopene ε-cyclase,
IPP isomerase or β-carotene hydroxylase enzyme activity, the method comprising inserting
into the host cell a vector comprising the heterologous nucleic acid sequence, wherein the 
heterologous nucleic acid sequence is operably linked to a promoter; and expressing the 
heterologous nucleic acid sequence. 

Also included is a method of expressing, in a host cell, a heterologous nucleic acid 
sequence which encodes for a protein which modifies lycopene ε-cyclase, IPP isomerase or
β-carotene hydroxylase enzyme activity in the host cell, relative to an untransformed host
cell, the method comprising inserting into the host cell a vector comprising the heterologous 
nucleic acid sequence, wherein the heterologous nucleic acid sequence is operably linked to a 
promoter; and expressing the heterologous nucleic acid sequence. 

Another subject of the present invention is a method for screening for genes and 
cDNAs which encode enzymes involved in carotenoid biosynthesis and metabolism. 

BRIEF DESCRIPTION OF THE DRAWINGS 
A more complete appreciation of the invention and many of the attendant advantages 
thereof will be readily obtained as the same becomes better understood by reference to the 
following detailed description when considered in connection with the accompanying 
drawings, wherein: 

Figure 1 is a schematic representation of the putative pathway of β-carotene
biosynthesis in cyanobacteria, algae and plants. The enzymes catalyzing various steps are
indicated at the left. Target sites of the bleaching herbicides NFZ and MPTA are also
indicated at the left. Abbreviations: DMAPP, dimethylallyl pyrophosphate; FPP, farnesyl
pyrophosphate; GGPP, geranylgeranyl pyrophosphate; GPP, geranyl pyrophosphate; IPP,
isopentenyl pyrophosphate; LCY, lycopene cyclase; MVA, mevalonic acid; MPTA,
2-(4-methylphenoxy)triethylamine hydrochloride; NFZ, norflurazon; PDS, phytoene desaturase;
PSY, phytoene synthase; ZDS, ζ-carotene desaturase; PPPP, prephytoene pyrophosphate.

Figure 2 depicts possible routes of synthesis of cyclic carotenoids and common plant
and algal xanthophylls (oxycarotenoids) from neurosporene. Demonstrated activities of the
β- and ε-cyclase enzymes of A. thaliana are indicated by bold arrows labelled with β or ε
respectively. A bar below the arrow leading to ε-carotene indicates that the enzymatic
activity was examined but no product was detected. The steps marked by an arrow with a
dotted line have not been specifically examined. Conventional numbering of the carbon
atoms is given for neurosporene and α-carotene. Inverted triangles (▼) mark positions of the
double bonds introduced as a consequence of the desaturation reactions.

Figure 3 depicts the carotene endgroups which are found in plants. 

Figure 4 is a DNA sequence and the predicted amino acid sequence of a lycopene
ε-cyclase cDNA isolated from A. thaliana (SEQ ID NOS: 1 and 2). These sequences were
deposited under Genbank accession number U50738. This cDNA is incorporated into the 
plasmid pATeps. 

Figure 5 is a DNA sequence encoding the β-carotene hydroxylase isolated from A.
thaliana (SEQ ID NO: 3). This cDNA is incorporated into the plasmid pATOHB. 

Figure 6 is an alignment of the predicted amino acid sequences of A. thaliana
β-carotene hydroxylase (SEQ ID NO: 4) with those of the bacterial β-carotene hydroxylase
enzymes from Alcaligenes sp. (SEQ ID NO: 5) (GenBank D58422), Erwinia herbicola Eho10
(SEQ ID NO: 6) (GenBank M872280), Erwinia uredovora (SEQ ID NO: 7) (GenBank
D90087) and Agrobacterium aurantiacum (SEQ ID NO: 8) (GenBank D58420). A
consensus sequence is also shown. All five genes are identical where a capital letter appears
in the consensus. A lowercase letter indicates that three of five, including A. thaliana, have
the identical residue. TM, transmembrane.

Figure 7 is a DNA sequence of a cDNA encoding an IPP isomerase isolated from A. 
thaliana (SEQ ID NO: 9). This cDNA is incorporated into the plasmid pATDPS. 

Figure 8 is a DNA sequence of a second cDNA encoding another IPP isomerase 
isolated from A. thaliana (SEQ ID NO: 10). This cDNA is incorporated into the plasmid 
pATDP7. 

Figure 9 is a DNA sequence of a cDNA encoding an IPP isomerase isolated from 
Haematococcus pluvialis (SEQ ID NO: 11). This cDNA is incorporated into the plasmid 
pHP04. 

Figure 10 is a DNA sequence of a second cDNA encoding another IPP isomerase 
isolated from Haematococcus pluvialis (SEQ ID NO: 12). This cDNA is incorporated into 
the plasmid pHP05. 

Figure 11 is an alignment of the amino acid sequences predicted by IPP isomerase
cDNAs isolated from A. thaliana (SEQ ID NOS: 16 and 18), H. pluvialis (SEQ ID NOS: 14
and 15), Clarkia breweri (SEQ ID NO: 17) (see Blanc & Pichersky, Plant Physiol. (1995)
108:855; GenBank accession no. X82627) and Saccharomyces cerevisiae (SEQ ID NO: 19)
(GenBank accession no. J05090).

Figure 12 is a DNA sequence of the cDNA encoding an IPP isomerase isolated from
Tagetes erecta (marigold; SEQ ID NO: 13). This cDNA is incorporated into the plasmid
pPMDP1. xxx's denote a region not originally sequenced. Figure 21A shows the complete
marigold sequence.

Figure 13 is an alignment of the consensus sequence of four plant β-cyclases (SEQ ID
NO: 20) with the A. thaliana lycopene ε-cyclase (SEQ ID NO: 21). A capital letter in the
plant β consensus is used where all four β-cyclase genes predict the same amino acid residue
in this position. A small letter indicates that an identical residue was found in three of the
four. Dashes indicate that the amino acid residue was not conserved and dots in the sequence
denote a gap. A consensus for the aligned sequences is given, in capital letters below the
alignment, where the β- and ε-cyclases have the same amino acid residue. Arrows indicate
some of the conserved amino acids that will be used as junction sites for construction of
chimeric cyclases with novel enzymatic activities. Several regions of interest, including a
sequence signature indicative of a dinucleotide-binding motif and two predicted
transmembrane (TM) helical regions, are indicated below the alignment and are underlined.

Figure 14 shows the nucleotide (SEQ ID NO:22) and amino acid sequences (SEQ ID
NO:23) of the Adonis palaestina (pheasant's eye) ε-cyclase cDNA #5.

Figure 15A shows the nucleotide (SEQ ID NO:24) and amino acid sequences (SEQ
ID NO:25) of a potato ε-cyclase cDNA. Figure 15B shows the amino acid sequence (SEQ ID
NO:26) of a chimeric lettuce/potato lycopene ε-cyclase. Amino acids in lower case are from
the lettuce cDNA and those in upper case are from the potato cDNA. The product of this
chimeric cDNA has ε-cyclase activity and converts lycopene to the monocyclic δ-carotene.

Figure 16 shows a comparison between the amino acid sequences of the Arabidopsis
ε-cyclase (SEQ ID NO:27) and the potato ε-cyclase (SEQ ID NO:25).

Figure 17A shows the nucleotide sequence of the Adonis palaestina Ipi1 (SEQ ID
NO:28) and Figure 17B shows the nucleotide sequence of the Adonis palaestina Ipi2 (SEQ
ID NO:29).

Figure 18A shows the nucleotide sequence of the Haematococcus pluvialis Ipi1 (SEQ
ID NO:11) and Figure 18B shows the nucleotide sequence of the Haematococcus pluvialis Ipi2
(SEQ ID NO:30).

Figure 19A shows the nucleotide sequence of the Lactuca sativa (romaine lettuce)
Ipi1 (SEQ ID NO:31) and Figure 19B shows the nucleotide sequence of the Lactuca sativa
Ipi2 (SEQ ID NO:32).

Figure 20 shows the nucleotide sequence of the Chlamydomonas reinhardtii Ipi1
(SEQ ID NO:33).

Figure 21A shows the nucleotide sequence of the Tagetes erecta (marigold) Ipi1 (SEQ
ID NO:34) and Figure 21B shows the nucleotide sequence of the Oryza sativa (rice) Ipi1
(SEQ ID NO:35).

Figure 22 shows an amino acid sequence alignment of various plant and green algal
isopentenyl isomerases (IPI) (SEQ ID NOS:16, 36-45).

Figure 23 shows a comparison between Adonis palaestina ε-cyclase cDNA #3 and
cDNA #5 nucleotide sequences.

Figure 24 shows a comparison between Adonis palaestina ε-cyclase cDNA #3 and
cDNA #5 predicted amino acid sequences.

Figure 25 shows a sequence alignment of various plant β- and ε-cyclases. Those
sequences outlined in grey denote identical sequences among the ε-cyclases. Those
sequences outlined in black denote identical sequences among both the β- and ε-cyclases.

Figure 26 shows a sequence alignment of the plant ε-cyclases from Figure 25. Those
sequences outlined in black denote identical sequences among the ε-cyclases.

Figure 27 is a dendrogram or "tree" illustrating the degree of amino acid sequence
similarity for various lycopene β- and ε-cyclases.

Figure 28 shows a comparison between Arabidopsis ε-cyclase and lettuce ε-cyclase
predicted amino acid sequences.

DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention includes an isolated and/or purified nucleic acid sequence
which encodes for a protein having lycopene ε-cyclase, IPP isomerase or β-carotene
hydroxylase enzyme activity and having the amino acid sequence of SEQ ID NOS: 2, 4,
14-21, 23 or 25-27. Nucleic acids encoding lycopene ε-cyclase, β-carotene hydroxylase and IPP
isomerases have been isolated from several genetically distant sources.

The present inventors have isolated nucleic acids encoding the enzyme IPP isomerase,
which catalyzes the reversible conversion of isopentenyl pyrophosphate (IPP) to
dimethylallyl pyrophosphate (DMAPP). IPP isomerase cDNAs were isolated from the plants
A. thaliana, Tagetes erecta (marigold), Adonis palaestina (pheasant's eye) and Lactuca sativa
(romaine lettuce) and from the green algae H. pluvialis and Chlamydomonas reinhardtii.
Alignments of the amino acid sequences predicted by some of these cDNAs are shown in
Figures 12 and 22. Plasmids containing some of these cDNAs were deposited with the
American Type Culture Collection, 12301 Parklawn Drive, Rockville MD 20852 on March 4,
1996 under ATCC accession numbers 98000 (pHP05 - H. pluvialis); 98001 (pMDP1 -
marigold); 98002 (pATDP7 - A. thaliana) and 98004 (pHP04 - H. pluvialis).

The present inventors have also isolated nucleic acids encoding the enzyme
β-carotene hydroxylase, which is responsible for hydroxylating the β-endgroup in carotenoids.
The nucleic acid of the present invention is shown in SEQ ID NO: 3 and Figure 5. The
full-length cDNA product hydroxylates both end groups of β-carotene, as do products of cDNAs
which encode proteins truncated by up to 50 amino acids from the N-terminus. Products of
genes which encode proteins truncated between about 60-110 amino acids from the
N-terminus preferentially hydroxylate only one ring. A plasmid containing this gene was
deposited with the American Type Culture Collection, 12301 Parklawn Drive, Rockville MD
20852 on March 4, 1996 under ATCC accession number 98003 (pATOHB - A. thaliana).

The present inventors have also isolated nucleic acids encoding the enzyme lycopene
ε-cyclase, which is responsible for the formation of ε-endgroups in carotenoids. The A.
thaliana ε-cyclase adds an ε ring to only one end of the symmetrical lycopene while the
related β-cyclase adds a ring at both ends. The A. thaliana cDNA of the present invention is
shown in Figure 4 and SEQ ID NO: 1. A plasmid containing this gene was deposited with
the American Type Culture Collection, 12301 Parklawn Drive, Rockville MD 20852 on
March 4, 1996 under ATCC accession number 98005 (pATeps - A. thaliana).

In addition, lycopene ε-cyclases have been identified in lettuce and in Adonis
palaestina (cDNA #5) which encode enzymes that convert lycopene to the bicyclic
ε-carotene (ε,ε-carotene). An additional cDNA from Adonis palaestina (cDNA #3) encodes a
lycopene ε-cyclase which converts lycopene into δ-carotene (ε,ψ-carotene) and differs from
the lycopene ε-cyclase which forms bicyclic ε-carotene (ε,ε-carotene) by only 5 amino acids.
One or more of these amino acids may be modified by alteration of the nucleotide sequence
in the #5 cDNA to obtain an enzyme which forms the bicyclic ε,ε-carotene. The sequences
of the Adonis palaestina and Arabidopsis thaliana ε-cyclases have about 70% nucleotide
identity and about 72% amino acid identity.

Initial experiments by the inventors with chimeric genes indicated that the part of the
ε-cyclase which is responsible for adding 2 ε rings to form ε,ε-carotene is the carboxy
terminal portion of the gene. The lettuce ε-cyclase adds two ε rings to form ε,ε-carotene. A
DNA encoding a partial potato ε-cyclase (missing its amino terminal portion), when
combined with an amino terminal region from the lettuce ε-cyclase gene, produces a
monocyclic δ-carotene (ε,ψ-carotene). With the discovery of the differences between the
Adonis palaestina clone #3 and clone #5, the specific amino acids responsible for the addition
of an extra ε ring have been identified (Figure 24). Specifically, amino acid 55 is Thr in
clone #3 and Ser in clone #5, amino acid 210 is Asn in clone #3 and Asp in clone #5, amino
acid 231 is Asp in clone #3 and Glu in clone #5, amino acid 352 is Ile in clone #3 and Val in
clone #5, and amino acid 524 is Lys in clone #3 and Arg in clone #5. It can be appreciated
that these changes are quite conservative, as only one change, at amino acid 210, changes the
charge of the protein.
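
The observation that only the change at position 210 alters charge can be checked with a short sketch. The positions and residues are those listed above; the charge table is an illustrative assumption (side-chain charge near neutral pH, with His and all unlisted residues treated as uncharged), and the function name is hypothetical.

```python
# Side-chain charge near neutral pH (illustrative assumption; residues
# not listed here, including His, are treated as uncharged).
CHARGE = {"Asp": -1, "Glu": -1, "Lys": +1, "Arg": +1}

# (position, residue in clone #3, residue in clone #5), as listed in the text.
DIFFERENCES = [
    (55, "Thr", "Ser"),
    (210, "Asn", "Asp"),
    (231, "Asp", "Glu"),
    (352, "Ile", "Val"),
    (524, "Lys", "Arg"),
]


def charge_changing(differences):
    """Return the positions at which the two clones differ in side-chain charge."""
    return [pos for pos, res3, res5 in differences
            if CHARGE.get(res3, 0) != CHARGE.get(res5, 0)]


print(charge_changing(DIFFERENCES))
```

Under these assumptions the only position reported is 210 (Asn to Asp), in agreement with the statement above.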

Thus, it is clear that the nucleic acids of the invention encoding the enzymes as 
presently disclosed may be altered to increase a particularly desirable property of the enzyme, 
to change a property of the enzyme, or to diminish an undesirable property of the enzyme. 
Such modifications can be by deletion, substitution, or insertion of one or more amino acids, 
and can be performed by routine enzymatic manipulation of the nucleic acid encoding the 
enzyme (such as by restriction enzyme digestion, removal of nucleotides by mung bean 
nuclease or Bal31, insertion of nucleotides by Klenow fragment, and by religation of the
ends), by site-directed mutagenesis, or may be accidental, such as by low fidelity PCR or 
those obtained through mutations in hosts that are producers of the enzymes. These 
techniques as well as other suitable techniques are well known in the art. 

Mutations can be made in the nucleic acids of the invention such that a particular 
codon is changed to a codon which codes for a different amino acid. Such a mutation is 
generally made by making the fewest nucleotide changes possible. A substitution mutation 
of this sort can be made to change an amino acid in the resulting protein in a non- 
conservative manner (i.e., by changing the codon from an amino acid belonging to a grouping 
of amino acids having a particular size or characteristic to an amino acid belonging to another
grouping) or in a conservative manner (i.e., by changing the codon from an amino acid
belonging to a grouping of amino acids having a particular size or characteristic to an amino
acid belonging to the same grouping). Such a conservative change generally leads to less
change in the structure and function of the resulting protein. A non-conservative change is
more likely to alter the structure, activity or function of the resulting protein. The present 
invention should be considered to include sequences containing conservative changes which 
do not significantly alter the activity or binding characteristics of the resulting protein. 
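
As an illustration of changing a codon with the fewest nucleotide changes possible, the following minimal sketch searches the standard genetic code for the codons of a target amino acid that differ from a starting codon at the fewest positions. The function name and the example codon are hypothetical and are not taken from the disclosed sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def closest_codons(codon, target_aa):
    """Return (n, codons): the codons for target_aa that differ from `codon`
    at the fewest positions, and that number of differing positions."""
    def distance(other):
        return sum(x != y for x, y in zip(codon, other))

    candidates = [c for c, aa in CODON_TABLE.items() if aa == target_aa]
    best = min(distance(c) for c in candidates)
    return best, sorted(c for c in candidates if distance(c) == best)


# Hypothetical example: an Asn codon (AAT) changed to encode Asp (D).
print(closest_codons("AAT", "D"))
```

For the example shown, a single nucleotide change (AAT to GAT) suffices, whereas the other Asp codon (GAC) would require two changes.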
The following is one example of various groupings of amino acids:

Amino acids with nonpolar R groups: Alanine, Valine, Leucine, Isoleucine, Proline,
Phenylalanine, Tryptophan and Methionine.

Amino acids with uncharged polar R groups: Glycine, Serine, Threonine, Cysteine, Tyrosine,
Asparagine and Glutamine.

Amino acids with charged polar R groups (negatively charged at pH 6.0): Aspartic acid and
Glutamic acid.

Basic amino acids (positively charged at pH 6.0): Lysine, Arginine and Histidine.

Another grouping may be those amino acids with phenyl groups: Phenylalanine,
Tryptophan and Tyrosine.

Another grouping may be according to molecular weight (i.e., size of R groups).

Particularly preferred substitutions are:

- Lys for Arg and vice versa such that a positive charge may be maintained;

- Glu for Asp and vice versa such that a negative charge may be maintained;

- Ser for Thr such that a free -OH can be maintained; and

- Gln for Asn such that a free NH2 can be maintained.
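
The first four groupings above can be expressed as a simple lookup, under which a substitution is treated as conservative when both residues fall in the same grouping. This is a minimal sketch of that one example grouping only; the names used are hypothetical, and other groupings (e.g., by phenyl group or by size) would classify some pairs differently.

```python
# The four R-group groupings listed above (three-letter residue codes).
GROUPS = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Trp", "Met"},
    "uncharged polar": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}


def group_of(residue):
    """Return the name of the grouping that contains the residue."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise KeyError(residue)


def is_conservative(original, replacement):
    """True when both residues belong to the same grouping."""
    return group_of(original) == group_of(replacement)


for pair in [("Arg", "Lys"), ("Asp", "Glu"), ("Thr", "Ser"), ("Asn", "Gln"), ("Asn", "Asp")]:
    print(pair, is_conservative(*pair))
```

Each of the particularly preferred substitutions is classified as conservative by this lookup, while a change such as Asn to Asp crosses groupings and is not.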

Amino acid substitutions may also be introduced to substitute an amino acid with a
particularly preferable property. For example, a Cys may be introduced to provide a potential
site for disulfide bridges with another Cys. A His may be introduced as a particularly
"catalytic" site (i.e., His can act as an acid or base and is the most common amino acid in
biochemical catalysis). Pro may be introduced because of its particularly planar structure,
which induces β-turns in the protein's structure.

It is clear that certain modifications of SEQ ID NOS: 2, 4, 14-21, 23 or 25-27 can take 
place without destroying the activity of the enzyme. It is noted especially that truncated 
versions of the nucleic acids of the invention are functional. For example, several amino
acids (from 1 to about 120) can be deleted from the N-terminus of the lycopene ε-cyclases of
the invention, and a functional protein can still be produced. This fact is made especially
clear from Figure 25, which shows a sequence alignment of several plant ε-cyclases. As can
be seen from Figure 25, there is an enormous amount of sequence disparity between amino 
acid sequences 2 to about 50-70 (depending on the particular sequence, since gaps are 
present). There is less, but also a substantial amount of, sequence dissimilarity between about 
50-70 to about 90-120 (depending on the particular sequence). Thereafter, the sequences are 
fairly conserved, except for small pockets of dissimilarity between about 275-295 to about 
285-305 (depending on the particular sequence), and between about 395-415 to about 410- 
430 (depending on the particular sequence). 

The present inventors have found that the amount of the 5' region present in the 
nucleic acids of the invention can alter the activity of the enzyme. Instead of diminishing 
activity, truncating the 5' region of the nucleic acids of the invention may result in an enzyme 
with a different specificity. Thus, the present invention relates to nucleic acids and enzymes 
encoded thereby which are truncated to within 0-50, preferably 0-25, codons of the 5' 
initiation codon of their prokaryotic counterparts as determined by alignment maps as 
discussed below. 

For example, when the cDNA encoding A. thaliana β-carotene hydroxylase was
truncated, the resulting enzyme catalyzed the formation of β-cryptoxanthin as the major
product and zeaxanthin as a minor product, in contrast to its normal production of zeaxanthin.

The present invention is intended to include those nucleic acid and amino acid 
sequences in which substitutions, deletions, additions or other modifications have taken 
place, as compared to SEQ ID NOS: 2, 4, 14-21, 23 or 25-27, without destroying the activity 
of the enzyme. Preferably, the substitutions, deletions, additions or other modifications take 
place at the 5' end, or any other of those positions which already show dissimilarity between 
any of the presently disclosed amino acid sequences (see also Figure 25) or other amino acid 
sequences which are known in the art and which encode the same enzyme (i.e., lycopene
ε-cyclase, IPP isomerase or β-carotene hydroxylase).

In each case, nucleic acid and amino acid sequence similarity and identity is measured 
using sequence analysis software, for example, the Sequence Analysis, Gap, or BestFit 
software packages of the Genetics Computer Group (University of Wisconsin Biotechnology 
Center, 1710 University Avenue, Madison, Wisconsin 53705), MEGAlign (DNAStar, Inc.,
1228 S. Park St., Madison, Wisconsin 53715), or MacVector (Oxford Molecular Group, 2105
S. Bascom Avenue, Suite 200, Campbell, California 95008). Such software uses algorithms 
to match similar sequences by assigning degrees of identity to various substitutions, 
deletions, and other modifications, and includes detailed instructions as to useful parameters, 
etc., such that those of routine skill in the art can easily compare sequence similarities and 
identities. An example of a useful algorithm in this regard is the algorithm of Needleman and 
Wunsch, which is used in the Gap program discussed above. This program finds the 
alignment of two complete sequences that maximizes the number of matches and minimizes 
the number of gaps. Another useful algorithm is the algorithm of Smith and Waterman, 
which is used in the BestFit program discussed above. This program creates an optimal 
alignment of the best segment of similarity between two sequences. Optimal alignments are 
found by inserting gaps to maximize the number of matches using the local homology 
algorithm of Smith and Waterman. 
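
By way of illustration only, the global alignment approach of Needleman and Wunsch referred to above may be sketched in Python as follows. The function name and the scoring values (match, mismatch and a single linear gap penalty) are assumptions made for this sketch; they are not the parameters of the Gap program, which applies its own scoring matrices and gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b.

    Returns (aligned_a, aligned_b, score). Scoring values are
    illustrative assumptions, not those of any commercial package.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # residue against residue
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

With these assumed scores, aligning a sequence against a copy of itself lacking one residue introduces a single gap, so that the remaining residues all match.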

Conservative (i.e. similar) substitutions typically include substitutions within the 
following groups: glycine and alanine; valine, isoleucine and leucine; aspartic acid, glutamic 
acid, asparagine and glutamine; serine and threonine; lysine and arginine; and phenylalanine 
and tyrosine. Substitutions may also be made on the basis of conserved hydrophobicity or 
hydrophilicity (see Kyte and Doolittle, J. Mol. Biol. 157: 105-132 (1982)), or on the basis of 
the ability to assume similar polypeptide secondary structure (see Chou and Fasman, Adv. 
Enzymol. 47: 45-148 (1978)). 
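
For purposes of illustration, the conservative substitution groups listed above can be expressed as a simple lookup. The following Python sketch is not part of the disclosed methods; the function names are hypothetical, and only the six groups recited above are encoded (hydrophobicity and secondary-structure criteria are not modelled).

```python
# One-letter codes for the groups recited above:
# Gly/Ala; Val/Ile/Leu; Asp/Glu/Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("GA"), set("VIL"), set("DENQ"), set("ST"), set("KR"), set("FY"),
]


def is_conservative(x, y):
    """True if replacing residue x with y stays within one listed group."""
    x, y = x.upper(), y.upper()
    if x == y:
        return False  # identical residues are not a substitution
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref, var, kind) for each differing position.

    Both sequences must have equal length (no gaps); positions are 1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    changes = []
    for pos, (r, v) in enumerate(zip(reference, variant), start=1):
        if r != v:
            kind = "conservative" if is_conservative(r, v) else "non-conservative"
            changes.append((pos, r, v, kind))
    return changes
```

For example, under these groups a Glu to Asp or Val to Leu change is reported as conservative, whereas a Gly to Trp change is not.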

If comparison is made between nucleotide sequences, preferably the length of 
comparison sequences is at least 50 nucleotides, more preferably at least 60 nucleotides, at 
least 75 nucleotides or at least 100 nucleotides. It is most preferred if comparison is made 
between the nucleic acid sequences encoding the enzyme coding regions necessary for 
enzyme activity. If comparison is made between amino acid sequences, preferably the length 
of comparison is at least 20 amino acids, more preferably at least 30 amino acids, at least 40 
amino acids or at least 50 amino acids. It is most preferred if comparison is made between 
the amino acid sequences in the enzyme coding regions necessary for enzyme activity. 

It should be appreciated that also within the scope of the present invention are nucleic 
acid sequences encoding lycopene ε-cyclases, IPP isomerases and β-carotene hydroxylases 
which code for enzymes having the same amino acid sequence as SEQ ID NOS: 2, 4, 14-21, 
23 or 25-27, but which are degenerate to the nucleic acids specifically disclosed herein. 
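
The degeneracy referred to above can be illustrated with the standard genetic code. The Python sketch below is offered under stated assumptions: it uses the standard nuclear code only (organellar or other variant codes are not considered), the function names are hypothetical, and the short test sequences are illustrative rather than taken from the sequence listing.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Group synonymous codons by the residue they encode.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def translate(dna):
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_degenerate_encodings(peptide):
    """Number of distinct coding sequences that encode the given peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide.upper())


def encode_same_protein(dna1, dna2):
    """True if two nucleic acid sequences are degenerate to one another."""
    return translate(dna1) == translate(dna2)
```

Even a four-residue peptide such as Met-Glu-Cys-Val has 1 x 2 x 2 x 4 = 16 possible coding sequences, which indicates how rapidly the number of degenerate nucleic acids grows with the length of the encoded enzyme.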

The amino acid residues described herein are preferred to be in the "L" isomeric form. 
However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, 
as long as the desired functional property of immunoglobulin-binding is retained by the 
polypeptide. 

In accordance with the present invention there may be employed conventional 
molecular biology, microbiology, and recombinant DNA techniques within the skill of the 
art. Such techniques are explained fully in the literature. See, e.g., Sambrook et al., 
"Molecular Cloning: A Laboratory Manual" (1989); "Current Protocols in Molecular 
Biology" Volumes I-III [Ausubel, R. M., ed. (1994)]; "Cell Biology: A Laboratory 
Handbook" Volumes I-III [J. E. Celis, ed. (1994)]; "Current Protocols in Immunology" 
Volumes I-III [Coligan, J. E., ed. (1994)]; "Oligonucleotide Synthesis" (M.J. Gait ed. 1984); 
"Nucleic Acid Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)]; "Transcription And 
Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; "Animal Cell Culture" [R.I. 
Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" [IRL Press, (1986)]; B. Perbal, "A 
Practical Guide To Molecular Cloning" (1984). 

The present invention also includes vectors. Suitable vectors according to the present 
invention comprise a nucleic acid of the invention encoding an enzyme involved in 
carotenoid biosynthesis or metabolism and a suitable promoter for the host, and can be 
constructed using techniques well known in the art (for example Sambrook et al., Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 
1989; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing and Wiley 
Interscience, New York, 1991). Suitable vectors for eukaryotic expression in plants are 
described in Frey et al., Plant J. (1995) 8(5):693 and Misawa et al., 1994a; incorporated herein 
by reference. Suitable vectors for prokaryotic expression include pACYC184, pUC119, and 
pBR322 (available from New England BioLabs, Beverly, MA) and pTrcHis (Invitrogen) and 
pET28 (Novagen) and derivatives thereof. The vectors of the present invention can 
additionally contain regulatory elements such as promoters, repressors, selectable markers 
such as antibiotic resistance genes, etc. 

The nucleic acids encoding the carotenoid enzymes as described above, when cloned 
into a suitable expression vector, can be used to overexpress these enzymes in a plant 
expression system or to inhibit the expression of these enzymes. For example, a vector 
containing the gene encoding lycopene ε-cyclase can be used to increase the amount of 
α-carotene and carotenoids derived from α-carotene (such as lutein and α-cryptoxanthin) in an 
organism and thereby alter the nutritional value, pharmacology and visual appearance value 
of the organism. 

Therefore, the present invention includes a method of producing or enhancing the 
production of a carotenoid in a host cell, relative to an untransformed host cell, the method 
comprising inserting into the host cell a vector comprising a heterologous nucleic acid 
sequence which encodes for a protein having lycopene ε-cyclase, IPP isomerase or 
β-carotene hydroxylase enzyme activity, wherein the heterologous nucleic acid sequence is 
operably linked to a promoter; and expressing the heterologous nucleic acid sequence to 
produce the protein. 

The present invention also includes a method of modifying the production of 
carotenoids in a host cell, the method comprising inserting into the host cell a vector 
comprising a heterologous nucleic acid sequence which produces an RNA and/or encodes for 
a protein which modifies lycopene ε-cyclase, IPP isomerase or β-carotene hydroxylase 
enzyme activity, relative to an untransformed host cell, wherein the heterologous nucleic acid 
sequence is operably linked to a promoter; and expressing the heterologous nucleic acid 
sequence in the host cell to modify the production of the carotenoids in the host cell, relative 
to the untransformed host cell. 

The term "modifying the production" means that the amount of carotenoids produced 
in the host cell can be enhanced, reduced, or left the same, as compared to the untransformed 
host cell. In accordance with one embodiment of the present invention, the make-up of the 
carotenoids (i.e., the specific carotenoids produced) is changed vis a vis each other, and this 
change in make-up may result in either a net gain, net loss, or no net change in the total 
amount of carotenoids produced in the cell. In accordance with another embodiment of the 
present invention, the production or the biochemical activity of the carotenoids (or the 
enzymes which catalyze their formation) is enhanced by the insertion of an enzyme-encoding 
nucleic acid of the invention. In yet another embodiment of the invention, the production or 
the biochemical activity of the carotenoids (or the enzymes which catalyze their formation) 
may be reduced or inhibited by a number of different approaches available to those skilled in 
the art, including but not limited to such methodologies or approaches as anti-sense (e.g., 
Gray et al. (1992) Plant Mol. Biol. 19:69-87), ribozymes (e.g., Wegener et al. (1994) Mol. 
Gen. Genet. 245:465-470), co-suppression (e.g., Fray and Grierson (1993) Plant Mol. Biol. 
22:589-602), targeted disruption of the gene (e.g., Schaefer et al. (1997) Plant J. 11:1195- 
1206), intracellular antibodies (e.g., Rondon and Marasco (1997) Ann. Rev. Microbiol. 
51:257-283) or whatever other approaches rely on the knowledge or availability of the nucleic 
acid or amino acid sequences of the invention and/or portions thereof, to thereby reduce 
accumulation of carotenoids with ε rings and compounds derived from them (for ε-cyclase 
inhibition), or carotenoids with hydroxylated β rings and compounds derived from them (for 
β-hydroxylase inhibition), or, in the case of IPP isomerase, accumulation of any isoprenoid 
compound. 

Preferably, at least a portion of the nucleic acid sequences used in the methods, 
vectors and host cells of the invention codes for an enzyme having an amino acid sequence 
which is at least 85% identical, preferably at least 90%, at least 95% or completely identical 
to SEQ ID NOS: 2, 4, 14-21, 23 or 25-27. Sequence identity is determined as noted above. 
Preferably, sequence additions, deletions or other modifications are made as indicated above, 
so as to not affect the function of the particular enzyme. 
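
As a non-limiting illustration of the identity thresholds recited above, the following Python sketch computes percent identity from two sequences that have already been aligned (gaps written as "-") and reports which threshold is met. The function names are hypothetical, and the convention of excluding gap-containing columns from the comparison is an assumption of this sketch; sequence analysis packages may count gaps differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the non-gap columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct):
    """Map a percent identity onto the 85%, 90% and 95% thresholds."""
    for threshold in (95, 90, 85):
        if pct >= threshold:
            return f"at least {threshold}%"
    return "below 85%"
```

For instance, two 20-residue sequences differing at two positions are 90% identical and therefore fall in the "at least 90%" tier.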

In a preferred embodiment, vectors are manufactured which contain a DNA encoding 
a eukaryotic IPP isomerase upstream of a DNA encoding a second eukaryotic carotenoid 
enzyme. The inventors have discovered that inclusion of an IPP isomerase gene increases the 
supply of substrate for the carotenoid pathway, thereby enhancing the production of 
carotenoid endproducts, as compared to a host cell which is not transformed with such a 
vector. This is apparent from the much deeper pigmentation in carotenoid-accumulating 
colonies of E. coli which also contain one of the aforementioned IPP isomerase genes when 
compared to colonies that lack this additional IPP isomerase gene. Similarly, a vector 
comprising an IPP isomerase gene can be used to enhance production of any secondary 
metabolite of dimethylallyl pyrophosphate and/or isopentenyl pyrophosphate (such as 
isoprenoids, steroids, carotenoids, etc.). The term "isoprenoid" is intended to mean any 
member of the class of naturally occurring compounds whose carbon skeletons are composed, 
in part or entirely, of isopentyl C5 units. Preferably, the carbon skeleton is of an essential oil, 
a fragrance, a rubber, a carotenoid, or a therapeutic compound, such as paclitaxel. 

A vector containing the cDNA encoding a lycopene ε-cyclase of the invention, 
preferably the lettuce lycopene ε-cyclase or Adonis ε-cyclase #5, can be used to increase the 
amount of bicyclic ε-carotene in an organism and thereby alter the nutritional value, 
pharmacology and visual appearance value of the organism. In addition, the transformed 
organism can be used in the formulation of therapeutic agents, for example in the treatment of 
cancer (see Mayne et al (1996) FASEB J. 10:690-701; Tsushima et al (1995) Biol. Pharm. 
Bull. 18:227-233). 

An antisense strand of a nucleic acid of the invention can be inserted into a vector. 
For example, the lycopene ε-cyclase gene can be inserted into a vector and incorporated into 
the genomic DNA of a host, thereby inhibiting the synthesis of ε,β-carotenoids (lutein and 
α-carotene) and enhancing the synthesis of β,β-carotenoids (zeaxanthin and β-carotene). 

The present invention also relates to novel enzymes which are encoded by the amino 
acid sequences of the invention, or portions thereof. 

The present invention also relates to novel enzymes which can transform known 
carotenoids into novel or uncommon products. Currently ε-carotene (see Figure 2) and 
γ-carotene are commonly produced only in minor amounts. As described below, an enzyme 
can be produced which transforms lycopene to γ-carotene and lycopene to ε-carotene. With 
these products in hand, bulk synthesis of other carotenoids derived from them is possible. 
For example, ε-carotene can be hydroxylated to form lactucaxanthin, an isomer of lutein (one 
ε and one β ring) and zeaxanthin (two β rings) where both endgroups are, instead, ε rings. 

In addition to novel enzymes produced by truncating the 5' region of known enzymes, 
as discussed above, novel enzymes which can participate in the formation of unusual 
carotenoids can be formed by replacing portions of one gene with an analogous sequence 
from a structurally related gene. For example, β-cyclase and ε-cyclase are structurally 
related (see Figure 13). By replacing a portion of β-lycopene cyclase with the analogous 
portion of ε-cyclase, an enzyme which produces γ-carotene will be produced (one β 
endgroup). Further, by replacing a portion of the lycopene ε-cyclase with the analogous 
portion of β-cyclase, an enzyme which produces ε-carotene will be produced (with some 
exceptions, such as the lettuce ε-cyclase, plant ε-cyclases normally produce a compound with 
one ε-endgroup, δ-carotene). Similarly, β-hydroxylase could be modified to produce 
enzymes of novel function by creation of hybrids with ε-hydroxylase. 

Host systems according to the present invention can comprise any organism that 
already produces carotenoids or which has been genetically modified to produce carotenoids. 
The IPP isomerase genes are more broadly applicable for enhancing production of any 
product dependent on DMAPP and/or IPP as a precursor. 

Organisms which already produce carotenoids include plants, algae, some yeasts, 
fungi and cyanobacteria and other photosynthetic bacteria. Transformation of these hosts 
with vectors according to the present invention can be done using standard techniques such as 
those described in Misawa et al., (1990) supra; Hundle et al., (1993) supra; Hundle et al., 
(1991) supra; Misawa et al., (1991) supra; Sandmann et al., supra; and Schnurr et al., supra. 

Transgenic organisms can be constructed which include the nucleic acid sequences of 
the present invention (Bird et al., 1991; Bramley et al., 1992; Misawa et al., 1994a; Misawa et 
al., 1994b; Cunningham et al., 1993). The incorporation of these sequences can allow the 
controlling of carotenoid biosynthesis, content, or composition in the host cell. These 
transgenic systems can be constructed to incorporate sequences which allow for the 
overexpression of the nucleic acids of the present invention. Transgenic systems can also be 
constructed containing antisense expression of the nucleic acid sequences of the present 
invention. Such antisense expression would result in the accumulation of the substrates of the 
enzyme encoded by the sense strand. 

A method for screening for eukaryotic genes which encode enzymes involved in 
carotenoid biosynthesis comprises transforming a prokaryotic host with a nucleic acid which 
may contain a eukaryotic or prokaryotic carotenoid biosynthetic gene; culturing said 
transformed host to obtain colonies; and screening for colonies exhibiting a different color 
than colonies of the untransformed host. 

Suitable hosts include E. coli, cyanobacteria such as Synechococcus and 
Synechocystis, algal and plant cells. E. coli is preferred. 

In a preferred embodiment, the above "color complementation" screening protocol can 
be enhanced by using mutants which are either (1) deficient in at least one carotenoid 
biosynthetic gene or (2) overexpress at least one carotenoid biosynthetic gene. In either case, 
such mutants will accumulate carotenoid precursors. 

Prokaryotic and eukaryotic DNA or cDNA libraries can be screened in total for the 
presence of genes of carotenoid biosynthesis, metabolism and degradation. Preferred 
organisms to be screened include photosynthetic organisms. 

E. coli can be transformed with these eukaryotic cDNA libraries using conventional 
methods such as those described in Sambrook et al., 1989 and according to protocols 
described by the vendors of the cloning vectors. 

For example, the cDNA libraries in bacteriophage vectors such as lambdaZAP 
(Stratagene) or lambda ZIPLOX (Gibco BRL) can be excised en masse and used to transform 
E. coli. 

Transformed E. coli can be cultured using conventional techniques. The culture broth 
preferably contains antibiotics to select and maintain plasmids. Suitable antibiotics include 
penicillin, ampicillin, chloramphenicol, etc. Culturing is typically conducted at 15-40°C, 
preferably at room temperature or slightly above (18-28°C), for 12 hours to 7 days. 

Cultures are plated and the plates are screened visually for colonies with a different 
color than the colonies of the host E. coli transformed with the empty plasmid cloning vector. 
For example, E. coli transformed with the plasmid, pAC-BETA (described below), produce 
yellow colonies that accumulate β-carotene. After transformation with a cDNA library, 
colonies which contain a different hue than those formed by E. coli/pAC-BETA would be 
expected to contain enzymes which modify the structure or accumulation of β-carotene. 
Similar E. coli strains can be engineered which accumulate earlier products in carotenoid 
biosynthesis, such as lycopene, γ-carotene, etc. 

Having generally described this invention, a further understanding can be obtained by 
reference to certain specific examples which are provided herein for purposes of illustration 
only and are not intended to be limiting unless otherwise specified. 

EXAMPLE 

I. Isolation of β-carotene hydroxylase 
Plasmid Construction 

An 8.6 kb BglII fragment containing the carotenoid biosynthetic genes of Erwinia 
herbicola was first cloned in the BamHI site of plasmid vector pACYC184 (chloramphenicol 
resistant), and then a 1.1 kb BamHI fragment containing the E. herbicola β-carotene 
hydroxylase (CrtZ) was deleted. E. coli strains containing the resulting plasmid, pAC-BETA, 
accumulate β-carotene and form yellow colonies (Cunningham et al., 1994). 

A full length cDNA encoding IPP isomerase of Haematococcus pluvialis (HP04) was 
first excised with BamHI and KpnI from pBluescript SK-, and then ligated into the 
corresponding sites of the pTrcHisA vector with high-level expression from the trc promoter 
(Invitrogen, Inc.). A fragment containing the IPP isomerase and trc promoter was 
subsequently excised with EcoRV and KpnI, treated with the Klenow fragment of DNA 
polymerase to produce blunt ends, and ligated in the Klenow-treated HindIII site of 
pAC-BETA. E. coli cells transformed with this new plasmid pAC-BETA-04 form orange colonies 
on LB plates (vs. yellow for those containing pAC-BETA) and cultures accumulate 
substantially more β-carotene (ca. two fold) than those that contain pAC-BETA. 

Screening of an Arabidopsis cDNA Library 

Several λ cDNA expression libraries of Arabidopsis were obtained from the 
Arabidopsis Biological Resource Center (Ohio State University, Columbus, OH) (Kieber et 
al., 1993). The λ cDNA libraries were excised in vivo using Stratagene's ExAssist/SOLR 
system to produce a phagemid cDNA library wherein each phagemid contained also a gene 
conferring resistance to the antibiotic ampicillin. 

E. coli strain DH10BZIP was chosen as the host cell for the screening and pigment 
production, although we have also used TOP10 F' and XL1-Blue for this purpose. DH10B 
cells were transformed with plasmid pAC-BETA-04 and were plated on LB agar plates 
containing chloramphenicol at 50 µg/ml (from United States Biochemical Corporation). The 
phagemid Arabidopsis cDNA library was then introduced into DH10B cells already 
containing pAC-BETA-04. Transformed cells containing both pAC-BETA-04 and 
Arabidopsis cDNA library phagemids were selected on chloramphenicol plus ampicillin (150 
µg/ml) agar plates. Maximum color development occurred after 3 to 7 days incubation at 
room temperature, and the rare bright yellow colonies were selected from a background of 
many thousands of orange colonies on each agar plate. Selected colonies were inoculated 
into 3 ml liquid LB medium containing ampicillin and chloramphenicol, and cultures were 
incubated at room temperature for 1-2 days, with shaking. Cells were then harvested by 
centrifugation and extracted with acetone in microfuge tubes. After centrifugation, the 
pigmented extract was spotted onto silica gel thin-layer chromatography (TLC) plates, and 
developed with a hexane:ether (1:1, by volume) mobile phase. β-carotene hydroxylase-encoding 
cDNAs were identified based on the appearance of a yellow pigment that co-migrated 
with zeaxanthin on the TLC plates. 

Subcloning and Sequencing 

The plasmid containing the β-carotene hydroxylase cDNA was recovered and 
analyzed by standard procedures (Sambrook et al., 1989). The Arabidopsis β-carotene 
hydroxylase was sequenced completely on both strands on an automatic sequencer (Applied 
Biosystems, Model 373A, Version 2.0.1S). The cDNA insert of 0.95 kb also was excised 
and ligated into a pTrcHis vector. A BglII restriction site within the cDNA was used to 
remove that portion of the cDNA that encodes the predicted polypeptide N terminal sequence 
region that is not also found in bacterial β-carotene hydroxylases (Figure 6). A BglII-XhoI 
fragment was directionally cloned in BamHI-XhoI digested TrcHis vectors. 

Pigment Analysis 

A single colony was used to inoculate 50 ml of LB containing ampicillin and 
chloramphenicol in a 250-ml flask. Cultures were incubated at 28°C for 36 hours with gentle 
shaking, and then harvested at 5000 rpm in an SS-34 rotor. The cells were washed once with 
distilled H2O and resuspended with 0.5 ml of water. The extraction procedures and HPLC 
were essentially as described previously (Cunningham et al., 1994). 

II. Isolation and biochemical analysis of an Arabidopsis lycopene ε-cyclase 
Plasmid Construction 

Construction of plasmids pAC-LYC, pAC-NEUR, and pAC-ZETA is described in 
Cunningham et al., (1994). In brief, the appropriate carotenoid biosynthetic genes from 
Erwinia herbicola, Rhodobacter capsulatus, and Synechococcus sp. strain PCC7942 were 
cloned in the plasmid vector pACYC184 (New England BioLabs, Beverly, MA). Cultures of 
E. coli containing the plasmids pAC-ZETA, pAC-NEUR, and pAC-LYC, accumulate 
ζ-carotene, neurosporene, and lycopene, respectively. The plasmid pAC-BETA was 
constructed as follows: an 8.6-kb BglII fragment containing the carotenoid biosynthetic genes 
of E. herbicola (GenBank M87280; Hundle et al., 1991) was obtained after partial digestion 
of plasmid pPL376 (Perry et al., 1986; Tuveson et al., 1986) and cloned in the BamHI site of 
pACYC184 to give the plasmid pAC-EHER. Deletion of adjacent 0.8- and 1.1-kb BamHI-BamHI 
fragments (deletion Z in Cunningham et al., 1994), and of a 1.1 kb SalI-SalI fragment 
(deletion X) served to remove most of the coding regions for the E. herbicola β-carotene 
hydroxylase (crtZ gene) and zeaxanthin glucosyltransferase (crtX gene), respectively. The 
resulting plasmid, pAC-BETA, retains functional genes for geranylgeranyl pyrophosphate 
synthase (crtE), phytoene synthase (crtB), phytoene desaturase (crtI), and lycopene cyclase 
(crtY). Cells of E. coli containing this plasmid form yellow colonies and accumulate 
β-carotene. A plasmid containing both the lycopene ε- and β-cyclase cDNAs of A. thaliana 
was constructed by excising the ε-cyclase in clone y2 as a PvuI-PvuII fragment and ligating 
this piece in the SnaBI site of a plasmid (pSPORT 1 from GIBCO-BRL) that already 
contained the β-cyclase (Cunningham et al., 1996). 

Organisms and Growth Conditions 

E. coli strains TOP10 and TOP10 F' (obtained from Invitrogen Corporation, San 
Diego, CA) and XL1-Blue (Stratagene) were grown in Luria-Bertani (LB) medium 
(Sambrook et al., 1989) at 37°C in darkness on a platform shaker at 225 cycles per min. 
Media components were from Difco (yeast extract and tryptone) or Sigma (NaCl). 
Ampicillin at 150 µg/mL and/or chloramphenicol at 50 µg/mL (both from United States 
Biochemical Corporation) were used, as appropriate, for selection and maintenance of 
plasmids. 
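
The working antibiotic concentrations given above can be converted into volumes of stock solution by a simple dilution calculation. The Python sketch below is illustrative only: the function name is hypothetical, and the stock concentrations and medium volume used in the example (100 mg/mL ampicillin, 34 mg/mL chloramphenicol, 50 mL of medium) are assumptions, since stock concentrations are not specified herein.

```python
def stock_volume_ul(final_ug_per_ml, medium_ml, stock_mg_per_ml):
    """Microlitres of stock needed to reach a final concentration.

    The small volume added by the stock itself is neglected.
    1 mg/mL equals 1 ug/uL, so ug divided by (mg/mL) gives uL.
    """
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    total_ug = final_ug_per_ml * medium_ml
    return total_ug / stock_mg_per_ml


# Working concentrations from the text; stocks and volume are assumptions.
ampicillin_ul = stock_volume_ul(150, 50, 100)       # 150 ug/mL in 50 mL
chloramphenicol_ul = stock_volume_ul(50, 50, 34)    # 50 ug/mL in 50 mL
```

Under these assumed stocks, 50 mL of medium would receive 75 µL of ampicillin stock and roughly 73.5 µL of chloramphenicol stock.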

Mass Excision and Color Complementation Screening of an A. thaliana cDNA Library 

A size-fractionated 1-2 kb cDNA library of A. thaliana in lambda ZAPII (Kieber et 
al., 1993) was obtained from the Arabidopsis Biological Resource Center at The Ohio State 
University (stock number CD4-14). Other size fractionated libraries were also obtained 
(stock numbers CD4-13, CD4-15, and CD4-16). An aliquot of each library was treated to 
cause a mass excision of the cDNAs and thereby produce a phagemid library according to the 
instructions provided by the supplier of the cloning vector (Stratagene; E. coli strain 
XL1-Blue and the helper phage R408 were used). The titre of the excised phagemid was 
determined and the library was introduced into a lycopene-accumulating strain of E. coli 
TOP10 F' (this strain contained the plasmid pAC-LYC) by incubation of the phagemid with 
the E. coli cells for 15 min at 37°C. Cells had been grown overnight at 30°C in LB medium 
supplemented with 2% (w/v) maltose and 10 mM MgSO4 (final concentration), and harvested 
in 1.5 ml microfuge tubes at a setting of 3 on an Eppendorf microfuge (5415C) for 10 min. 
The pellets were resuspended in 10 mM MgSO4 to a volume equal to one-half that of the 
initial culture volume. Transformants were spread on large (150 mm diameter) LB agar petri 
plates containing antibiotics to provide for selection of cDNA clones (ampicillin) and 
maintenance of pAC-LYC (chloramphenicol). Approximately 10,000 colony forming units 
were spread on each plate. Petri plates were incubated at 37°C for 16 hr and then at room 
temperature for 2 to 7 days to allow maximum color development. Plates were screened 
visually with the aid of an illuminated 3x magnifier and a low power stage-dissecting 
microscope for the rare, pale pinkish-yellow to deep-yellow colonies that could be observed 
in the background of pink colonies. A colony color of yellow or pinkish-yellow was taken as 
presumptive evidence of a cyclization activity. These yellow colonies were collected with 
sterile toothpicks and used to inoculate 3 ml of LB medium in culture tubes with overnight 
growth at 37°C and shaking at 225 cycles/min. Cultures were split into two aliquots in 
microfuge tubes and harvested by centrifugation at a setting of 5 in an Eppendorf 5415C 
microfuge. After discarding the liquid, one pellet was frozen for later purification of plasmid 
DNA. To the second pellet was added 1.5 ml EtOH, and the pellet was resuspended by 
vortex mixing, and extraction was allowed to proceed in the dark for 15-30 min with 
occasional remixing. Insoluble materials were pelleted by centrifugation at maximum speed 
for 10 min in a microfuge. Absorption spectra of the supernatant fluids were recorded from 
350-550 nm with a Perkin Elmer lambda six spectrophotometer. 
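
The plating density of approximately 10,000 colony forming units per plate described above implies a simple calculation from the measured titre. The following Python sketch is illustrative; the function names are hypothetical, and the titre and library size used in the example are assumed values, not measurements reported herein.

```python
import math


def plating_volume_ul(titre_cfu_per_ml, target_cfu=10_000):
    """Microlitres of transformed cell suspension to spread per plate."""
    if titre_cfu_per_ml <= 0:
        raise ValueError("titre must be positive")
    return 1000.0 * target_cfu / titre_cfu_per_ml


def plates_needed(clones_to_screen, cfu_per_plate=10_000):
    """Number of plates required to screen a given number of clones."""
    return math.ceil(clones_to_screen / cfu_per_plate)


# Assumed example values: a titre of 1e5 cfu/mL and 250,000 clones to screen.
volume = plating_volume_ul(1e5)
plates = plates_needed(250_000)
```

With the assumed titre, 100 µL of suspension would be spread on each 150 mm plate, and 25 plates would cover 250,000 clones.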

Analysis of isolated clones 

Eight of the yellow colonies contained β-carotene indicating that a single gene 
product catalyzes both cyclizations required to form the two β endgroups of the symmetrical 
β-carotene from the symmetrical precursor lycopene. One of the yellow colonies contained a 
pigment with the spectrum characteristic of δ-carotene, a monocyclic carotenoid with a single 
ε endgroup. Unlike the β cyclase, this ε-cyclase appears unable to carry out a second 
cyclization at the other end of the molecule. 

The observation that ε-cyclase is unable to form two cyclic ε-endgroups (e.g. the 
bicyclic ε-carotene) illuminates the mechanism by which plants can coordinate and control 
the flow of substrate into carotenoids derived from β-carotene versus those derived from 
α-carotene and also can prevent the formation of carotenoids with two ε endgroups. 

The availability of the A. thaliana gene encoding the ε-cyclase enables the directed 
manipulation of plant and algal species for modification of carotenoid content and 
composition. Through inactivation of the ε-cyclase, whether at the gene level by deletion of 
the gene or by insertional inactivation or by reduction of the amount of enzyme formed (such 
as by antisense technology), one may increase the formation of β-carotene and other 
pigments derived from it. Since vitamin A is derived only from carotenoids with β 
endgroups, an enhancement of the production of β-carotene versus α-carotene may enhance 
nutritional value of crop plants. Reduction of carotenoids with ε-endgroups may also be of 
value in modifying the color properties of crop plants and specific tissues of these plants. 
Alternatively, where production of α-carotene, or pigments such as lutein that are derived 
from α-carotene, is desirable, whether for the color properties, nutritional value or other 
reason, one may overexpress the ε-cyclase or express it in specific tissues. Wherever 
agronomic value of a crop is related to pigmentation provided by carotenoid pigments the 
directed manipulation of expression of the ε-cyclase gene and/or production of the enzyme 
may be of commercial value. 

The predicted amino acid sequence of the A. thaliana ε-cyclase enzyme was 
determined. A comparison of the amino acid sequences of the β- and ε-cyclase enzymes of 
Arabidopsis thaliana (Fig. 13) as predicted by the DNA sequence of the respective cDNAs 
(Fig. 4 for the ε-cyclase cDNA sequence), indicates that these two enzymes have many 
regions of sequence similarity, but they are only about 37% identical overall at the amino acid 
level. The degree of sequence identity at the DNA base level, only about 50%, is sufficiently 
low such that we and others have been unable to detect this gene by hybridization using the β 
cyclase as a probe in DNA gel blot experiments. 

We claim: 

1. An isolated and/or purified nucleic acid sequence which encodes for a protein having 
lycopene ε-cyclase enzyme activity and has an amino acid sequence which is at least 85% 
identical to one of SEQ ID NOS: 23 or 25-27. 

2. The nucleic acid sequence of claim 1, wherein the protein has the amino acid 
sequence of one of SEQ ID NOS: 23 or 25-27. 

3. A vector comprising the nucleic acid sequence of claim 1, wherein the nucleic acid 
sequence is operably linked to a promoter. 

4. A host cell which contains the vector of claim 3. 

5. The host cell of claim 4, wherein the host cell is selected from the group consisting of 
a bacterial cell, an algal cell, a yeast cell and a plant cell. 

6. The host cell of claim 4, wherein the host cell is a photosynthetic cell. 

7. An isolated and/or purified protein having lycopene ε-cyclase enzyme activity and 
having an amino acid sequence which is at least 85% identical to one of SEQ ID NOS: 23 or 
25-27. 

8. The protein of claim 7, wherein the protein has the amino acid sequence of one of 
SEQ ID NOS: 23 or 25-27. 

AMENDED CLAIMS 

[received by the International Bureau on 15 November 1999 (15.11.99); 
original claims 1, 2, 7 and 8 amended; remaining claims unchanged (1 page)] 

1. An isolated and/or purified nucleic acid sequence which encodes for a protein having 
lycopene ε-cyclase enzyme activity and has an amino acid sequence which is at least 85% 
identical to one of SEQ ID NOS: 23, 25 or 26. 

2. The nucleic acid sequence of claim 1, wherein the protein has the amino acid 
sequence of one of SEQ ID NOS: 23, 25 or 26. 

3. A vector comprising the nucleic acid sequence of claim 1, wherein the nucleic acid 
sequence is operably linked to a promoter. 

4. A host cell which contains the vector of claim 3. 

5. The host cell of claim 4, wherein the host cell is selected from the group consisting of 
a bacterial cell, an algal cell, a yeast cell and a plant cell. 

6. The host cell of claim 4, wherein the host cell is a photosynthetic cell. 

7. An isolated and/or purified protein having lycopene ε-cyclase enzyme activity and 
having an amino acid sequence which is at least 85% identical to one of SEQ ID NOS: 23, 25 
or 26. 

8. The protein of claim 7, wherein the protein has the amino acid sequence of one of 
SEQ ID NOS: 23, 25 or 26. 

FIG. 4A 

Arabidopsis thaliana epsilon cyclase: 

acaaaaggaaataattag attcctctttctgcttgctataccttgaca 48 

gaacaacataacaatggtgtaagtcttctc gctgtattcgaaattatttggaggaggaac 108 

atqgagtgtgttggggctaggaatttcgca gcaatggcggtttcaacatttccgtcatgg 168 
ME C V b A OFA AMAVSTFPSW 

a^tt^tc^aa^aaatttccagtggctaag agataca^ctataggaatattcgcttcggt 228 

ttgt^ta^tgtcagagcta^cggcggcgga a^ttccggta^tgaga^ttgtgtagcggtg 288 

aqagaagatttcgctgacgaagaagatttt gcgaaagctgqcggttctgagattctattt 348 
61 RbUF7\L)EEDF VEAGGSRILF 

qttcaaatgcagcagaacaaagatatggat gaacagtctaagcttgttgataagttgcct 408 
81 V Q M Oil KDMD SQSKLVDKLP 

cclatatcaactggtgatggtgctttggat catgtggttactggctgtggtcctgctggt 468 
101 PISI^D^ALD KVVIGCGPAG 

ttaqccttqgctgcagaatcagctaagctt ggattaaaagttggactcattggtccagat 528 
121 L n LA AKSAKLbLKVGLIbPD 

cttccttttactaacaattacggtgtttgg gaagatgaattcaatgatcttgggctgcaa 588 
141 LPFTMMYGVM KDKFNDLGLG 

aaatgtattgaqcatgtttggagagagact attgcgcacctggatgatgacaagcctatt 648 
161 K? IIK VOTT I V Y L D D D K P I 

accattggccgtgcttatggaagagttaqt cqacgtttgctccatgaggagcttttgagg 708 
181 T ilIn^HS RRLLXEELLR 

aq^g^gtcgagt^g^gtctcgtacctt a^ctcgaaagttgaca^cataacagaagct 768 

tqtqatqqccttaaacttgttgcttgtgac gacaataacgtcattccctgcaggcttgcc 828 
221 SDlLUVACD DMMVIPCXLA 

actqttqcttctggagcagcttcggqaaag ctcttgcaatacgaagttgqtgqacctaqa 888 
241 T V 5 ? SIVA Si K LLQYXVGGPR 

gtctgtgcgcaaactgcatacggcgtggag gttgaggcggaaaatagtccatatgatcca 948 
261 VCVQTAYGVX VXVXNSPYDP 

gatcaaatggttttcatggattacagagat tatactaacgagaaagttcggagcttagaa 1008 
281 DQMVPMDYRD YTMXXVRSLX 

gctgagtatccaacgtttctgtacgccatg cctatgacaaagtcaagactcttcttcgag 1068 
301 AKYPTFLYAM PMTKSRLFFK 

gagacatgtttggcctcaaaagatgtcatg ccctttgatttgctaaaaacgaagctcatg 1128 
321 KTCLASKDVM PFDLLKTKLM 



ttaagattagacacactcggaattcgaatt ctaaagacttacgaagaggagtggtcctat 1188 
341 LRLDTLGIRI LKTYEEEWSY 

atcccagttggtggttccttgccaaacacc gaacaaaagaatctcgcctttggtgctgcc 1248 
361 IPVGGSLPMT XQKNLAFGAA 

gctagcatggtacatcccgcaacaggctat tcagttgtgagatctttgtctgaagctcca 1308 
381 ASMVMPATGY SVVRSLSXAP 

aaacatgcatcagtcatcgcagagatacta agagaagagactaccaaacagattaacagt 1368 
401 KYASVIAKIL REETTKQINS 

aatatttcaagacaagcttaggatacttta tggccaccagaaaggaaaagacagagagca 1428 
421 MI5RQAWDTL WPPERXRQRA 

ttctttctctttggtcttgcactcagagtt caattcgataccgaaggcattagaagcttc 1488 
441 FFLFGLALIV QFDTX^IRSF 

ttccgtactttcttccgccttccaaaatgg atgtggcaagggtttctaggatcaacatta 1548 
461 FRTPFRLPKW MWQGFLfiSTL 

acatcaggagatctcgttctc±ttgcttta tacatgttcgtcatttcaccaaacaatttg 1608 
481 TSGDLVLFAL YMPVISPMML 

aqaaaa^tctcattaatcatctcatctct gatccaaccggagcaaccatgataaaaacc 1668 
501 R? G LINWLIS DPTGATMIKT 

tatctcaaagtatgatttacttaccaactc ttaggtttgtgtatatatatgccgatttat 1728 

521 Y L K V 

ctgaataatcgatcaaagaatggtatgtgg gttactaggaagttggaaacaaacacgtat 1788 

agaatctaaggagtgatcgaaatggagacg gaaacgaaaagaaaaaaatcagtctttgtt 1848 

ccgtggctagtg 1868 
FIG. 5 

1 gctctttctc ctcctcctct accgatttcc gactccgcct cccgaaatcc 

51 ttatccggat tctctccgtc tcttcgattt aaacgctttt ctgtctgtta 

101 cgtcgtcgaa gaacggagac agaattctcc gattgagaac gatgagagac 

151 cggagagcac gagctccaca aacgctatag acgctgagta tctggcgttg 

201 cgtttggcgg agaaattgga gaggaagaaa tcggagaggt ccacttatct 

251 aatcgctgct atgttgtcga gctttggtat cacttctatg gctgttatgg 

301 ctgtttacta cagattctct tggcaaatgg agggaggtga gatctcaatg 

351 ttggaaatgt ttggtacatt tgctctctct gttggtgctg ctgttggtat 

401 ggaattctgg gcaagatggg ctcatagagc tctgtggcac gcttctctat 

451 ggaatatgca tgagtcacat cacaaaccaa gagaaggacc gtttgagcta 

501 aacgatgttt ttgctatagt gaacgctggt ccagcgattg gtctcctctc 

551 ttatggattc ttcaataaag gactcgttcc tggtctctgc tttggcgccg 

601 ggttaggcat aacggtgttt ggaatcgcct acatgtttgt ccacgatggt 

651 ctcgtgcaca agcgtttccc tgtaggtccc atcgccgacg tcccttacct 

701 ccgaaaggtc gccgccgctc accagctaca tcacacagac aagttcaatg 

751 gtgtaccata tggactgttt cttggaccca aggaattgga agaagttgga 

801 ggaaatgaag agttagataa ggagattagt cggagaatca aatcatacaa 

851 aaaggcctcg ggctccgggt cgagttcgag ttcttgactt taaacaagtt 

901 ttaaatccca aattcttttt ttgtcttctg tcattatgat catcttaaga 

951 cggtct 
FIG. 7 

tttt^atrfcca actccaagta tgagttgctt cticcagcaac 
ggtcaaaaac aaagg-t1:ac*t trfceccacfrtg tgtggacaaa cacttc^tgc 
351 agccatccric tttaccgrga atccgagctt attgaagaga atg-gcrtgg 
401 tgrtaagaaat gccgcacaaa ggaagctttit cgatgagcrtc gg^tattg^ag 
451 cagaagatgt: accagticgat gagftcactc ccttigggacg catgctttac 
aaggcacct/t ctgatgggaa arggggagag 
act:atcract 
c uu4 ucg^g 
agcttcaacc aaacccagat gaagrggcrg 
601 agatcaagta cgHgagcagg gaagagctta aggagcrtgg^ gaagaaagca 
gatgetggcg atgaagctrgt gaaact.atct: c ca t g gti'w ca gatligg^ggr 
ggataatttc 
ggtigggatca tgttgagaaa ggaacratca 
ctgaagctgc agacatgaaa accattcaca agctctgaac tttccataag 
801 wtttggatct tcccctitccc ataataaaat taagagatga gactitirrart: 
851 gattacagac aaaac^ggca acaaaatcta ttcctaggat 
901 rttttattta citttgatnc atctctagtt tiag'w'wti'wCa w cttaaaaaaa 
951 aaaa 
FIG. 8 

1 caccaatgtc tgtctcttct ttatttaatc tcccarrgat tcgcctcaga 
51 tctctcgctc ttticgtcrttic ttitrtticirtei: ^rccGATTTG cccatcgtcc 
101 TCTGTCATCG ATTTCACCGA GAAAGTTACC GAATTTTGGT GCTTTCTCTG 
151 GTACCGCTAT GACAGATACT AAAGATGCTG GTATGGATGC TGTTCAGAGA 
201 CGTCTCATGT TTGAGGATGA ATGCATTCTT GTTGATGAAA CTGATCGTCT 
251 TGTGGGGCAT GTCAGCAAGT ATAATTGTCA TCTGATGGAA AATATTGAAG 
301 CCAAGAATTT GCTGCACAGG GCTTTTAGTG TATTTTTATT CAACTCGAAG 
351 TATGAGTTGC TTCTCCAGCA AAGGTCAAAC ACAAAGGTTA CGTTCCCTCT 
401 AGTGTGGACT AACACTTGTT GCAGCCATCC TCTTTACCGT GAATCAGAGC 
451 TTATCCAGGA CAATGCACTA GGTGTGAGGA ATGCTGCACA AAGAAAGCTT 
501 CTCGATGAGC TTGGTAXTGT AGCTGAAGAT GTACCAGTCG ATGAGTTCAC 
551 TCCCTTGGGA CGTATGCTGT ACAAGGCTCC TTCTGATGGC AAATGGGGAG 
601 AGCATGAACT TGATTACTTG CTCTTCATCG TGCGAGACGT GAAGGTTCAA 
651 CCAAACCCAG ATGAAGTAGC TGAGATCAAG TATGTGAGCC GGGAAGAGCT 
701 GAAGGAGCTG GTGAAGAAAG CAGATGCAGG TGAGGAAGGT TTGAAACTGT 
751 CACCATGGTT CAGATTGGTG GTGGACAATT TCTTGATGAA GTGGTCGGAT 
801 CATGTTGAGA AAGGAACTTT GGTTGAAGCT ATAGACATGA AAACCATCCA 
851 CAAACTCTGA ACATCTTTTT TTAAAGTTTT TAAATCAATC AACTTTCTCT 
901 TCATCATTTT TATCTTTTCG ATGATAATAA TTTGGGATAT GTGAGACACT 
951 TACAAAACTT CCAAGCACCT CAGGCAATAA TAAAGTTTGC GGCCGC 
FIG. 9 

1 CTCGGTAGCT GGCCACAATC GCTATTTGGA ACCTGGCCCG GCGGCAGTCC 
51 GATGCCGCGA TGCTTCGTTC GTTGCTCAGA GGCCTCACGC ATATCCCCCG 
101 CGTGAACTCC GCCCAGCAGC CCAGCTGTGC ACACGCGCGA CTCCAGTTTA 
151 AGCTCAGGAG CATGCAGATG ACGCTCATGC AGCCCAGCAT CTCAGCCAAT 
201 CTGTCGCGCG CCGAGGACCG CACAGACCAC ATGAGGGGTG CAAGCACCTG 
251 GGCAGGCGGC CAGTCGCAGG ATGAGCTGAT GCTGAAGGAC GAGTGCATCT 
301 TGGTGGATGT TGAGGACAAC ATCACAGGCC ATGCCAGCAA GCTGGAGTGT 
351 CACAAGTTCC TACCACATCA GCCTGCAGGC CTGCTGCACC GGGCCTTCTC 
401 TGTGTTCCTG TTTGACGATC AGGGGCGACT GCTGCTGCAA CAGCGTGCAC 
451 GCTCAAAAAT CACCTTCCCA AGTGTGTGGA CGAACACCTG CTGCAGCCAC 
501 CCTTTACATG GGCAGACCCC AGATGAGGTG GACCAACTAA GCCAGGTGGC 
551 CGACGGAACA GTACCTGCCG CAAAGGCTGC TGCCATCCGC AAGTTGGAGC 
601 ACGAGCTGGG GATACCAGCG CACCAGCTGC CGGCAAGCGC GTTTCGCTTC 
651 CTCACCCCTT TGCACTACTG TGCCGCGGAG GTGCAGCCAG CTGCCACACA 
701 ATCAGCGCTC TGGGGCGAGC ACGAAATGGA CTACATCTTG TTCATCCCGG 
751 CCAACGTCAC CTTGGCGCCC AACCCTGACG AGGTGGACGA AGTCAGGTAC 
801 GTGACGCAAG AGGAGCTGCG GCAGATGATG CAGCCGGACA ACGGGCTGCA 
851 ATGGTCGCCG TGGTTTCGCA TCATCGCCGC GCGCTTCCTT GAGCGTTGGT 
901 GGGCTGACCT GGACGCGGCC CTAAACACTG ACAAACACGA GGATTGGGGA 
951 ACGGTGCATC ACATCAACGA AGCGTGAAAG CAGAAGCTGC AGGATGTGAA 
1001 GACACGTCAT GGGGTGGAAT TGCGTACTTG GCAGCTTCGT ATCTCCTTTT 
1051 TCTCACACTG AACCTGCAGT CAGGTCCCAC AAGGTCAGGT AAAATGGCTC 
1101 GATAAAATGT ACCGTCACTT TTTGTCGCGT ATACTGAACT CCAAGAGGTC 
1151 AAAAAAAAAA AAAAA 
FIG. 10 

1 CTCGGTAGCT GGCCACAATC GCTATTTGGA ACCTGGCCCG GCGGCAGTCC 
51 GATGCCGCGA TGCTTCCTTC GTTGCTCAGA GGCCTCACGC ATATCCCGCG 
101 CGTGAACTCC GCCCAGCAGC CCAGCTGTGC ACACGCGCGA CTCCAGTTTA 
151 AGCTCAGGAG CATGCAGCTG CTTTCCGAGG ACCGCACAGA CCACATGAGG 
201 GGTGCAAGCA CCTGGGCAGG CGGGCAGTCG CAGGATGAGC TGATGCTGAA 
251 GGACCAGTGC ATCTTGGTAG ATGTTGAGGA CAACATCACA CGCCATGCCA 
301 GCAAGCTGGA CTCTCACAAG TTCCTACCAC ATCAGCCTGC AGGCCTGCTG 
351 CACCGGGCCT TCTCTGTGTT CCTGTTTGAC GATCAGGGGC GACTCCTGCT 
401 GCAACAGCGT GCACGCTCAA AAATCACCTT CCCAAGTGTG TGGACGAACA 
451 CCTGCTGCAG CCACCCTTTA CATGGGCAGA CCCCAGATGA GGTGGACCAA 
501 CTAAGCCAGG TGGCCGACGG AACAGTACCT GGCGCAAACG CTGCTGCCAT 
551 CCGCAAGTTG GAGCACGAGC TGGGGATACC AGCGCACCAG CTGCCGGCAA 
601 GCGCGTTTCG CTTCCTCACG CGTTTGCACT ACTGTGCCGC GGACGTGCAG 
651 CCAGCTGCGA CACAATCAGC GCTCTGGGGC GAGCACGAAA TGGACTACAT 
701 CTTGTTCATC CGGGCCAACG TCACCTTGGC GCCCAACCCT GACGAGGTGG 
751 ACGAAGTCAG GTACGTGACG CAAGAGGAGC TGCGGCAGAT GATGCAGCCG 
801 GACAACGGGC TTCAATGGTC GCCGTGGTTT CGCATCATCG CCGCGCGCTT 
851 CCTTGAGCGT TGGTGGGCTG ACCTGGACGC GGCCCTAAAC ACTGACAAAC 
901 ACGAGGATTG GGGAACGGTG CATCACATCA ACGAAGCGTG AAGGCAGAAG 
951 CTGCAGGATG TGAAGACACG TCATGGGGTG GAATTGCGTA CTTGGCAGCT 
1001 TCGTATCTCC TTTTTCTGAG ACTGAACCTG CAGAGCTAGA GTCAATGGTG 
1051 CATCATATTC ATCGTCTCTC TTTTGTTTTA GACTAATCTG TAGCTAGAGT 
1101 CACTGATGAA TCCTTTACAA CTTTCAAAAA AAAAA 
FIG. 11A 

         1                                                   50 
HP04     MLRSLLRGLT HIPRVNSAQQ PSCAHARLQF KLRSMQMTLM QPSISANLSR 
HP05     MLRSLLRGLT HIPRVNSAQQ PSCAHARLQF KLRSMQLL 
ATDP7    MSVSSLFNLP .LIRLRSLA. LSSSFSSFRF AHRPLSSIS. PRKLPNFRAF 
C.brew.  MS.SSMLNFT .ASRIVSLPL LSSPPSRVHL PLCFFSPISL TQRFSAKLTF 
AT0P5                TGPPPRFFP IRSPVPRTQL FVRAFSAV 
S.cerev. ..MTADNNSM PHGAVSSYAK LVQNQTPEDI LEEFPEIIPL QQRPN...TR 

         51                                                 100 
         AEDRTDHMRG ASTWAGGQSQ DELMLKDECI LVDVEDNITG HASKLECHKF 
         SEDRTDHMRG ASTWAGGQSQ DELMLKDECI LVDVEDNITG HASKLECHKF 
         S..GTA.MTD TKDAGMDAVQ RRLMFEDECI LVDETDRVVG HVSKYNCHLM 
         SSQATT.MGE VVDAGMDAVQ RRLMFEDECI LVDENDKVVG HESKYNCHLM 
              T.MTD SNDAGMDAVQ RRLMFEDECI LVDENNRVVG HDTKYNCHLM 
         SSETSNDESG ETCFSGHDEE QIKLMNENCI VLDWDDNAIG AGTKKVCHLM 

         101                                                150 
         LPHQPAGLLH RAFSVFLFDD QGRLLLQQRA RSKITFPSVW TNTCCSHPLH 
         LPHQPAGLLH RAFSVFLFDD QGRLLLQQRA RSKITFPSVW TNTCCSHPLH 
         ENIEAKNLLH RAFSVFLFNS KYELLLQQRS NTKVTFPLVW TNTCCSHPLY 
         ENIESENLLH RAFSVFLFNS KYELLLQQRS ATKVTFPLVW TNTCCSHPLY 
         EKIEAENLLH RAFSVFLFNS KYELLLQQRS KTKVTFPLVW TNTCCSHPLY 
         ENIE.KGLLH RAFSVFIFNE QGELLLQQRA TEKITFPDLW TNTCCSHPLC 

         151                                                200 
         GQTPDEVDQL SQVADGTVPG AKAAAIRKLE HELGIPAHQL PA.SAFRFLT 
         GQTPDEVDQL SQVADGTVPG AKAAAIRKLE HELGIPAHQL PA.SAFRFLT 
         RE         SELIQDNALG VRNAAQRKLL DELGIVAEDV PV.DEFTPLG 
         RE         SELIDENCLG VRNAAQRKLL DELGIPAEDL PV.DQFIPLS 
         RE         SELIEENVLG VRNAAQRKLF DELGIVAEDV PV.DEFTPLG 
         ID...DELGL KGKLDDKIKG AITAAVRKLD HELGIPEDET KTRGKFHFLN 

         201                                                250 
         RLHYCAADVQ PAATQSALWG EHEMDYILFI     RANVTL APNPDEVDEV 
         RLHYCAADVQ PAATQSALWG EHEMDYILFI     RANVTL APNPDEVDEV 
         RMLY        KAPSDGKWG EHELDYLLFI ....VRDVKV QPNPDEVAEI 
         RILY        KAPSDGKWG EHELDYLLFI     IRDVNL DPNPDEVAEV 
         RMLY        KAPSDGKWG EHEVDYLLFI     VRDVKL QPNPDEVAEI 
         RIHY        MAPSNEPWG EHEIDYILFY KINAKENLTV NPNVNEVRDF 
251 300 
RYVTQEELRQ MMQ....PDN GLQWSPWFRI IAARFLERWW ADLDAALNTD 

RYVTQEELRQ MMQ....PDN GLQWSPWFRI IAARFLERWW ADLDAALNTD 

KYVSREELKE LVKKADAGEE GLKLSPWFRL VVDNFLMKWW DHVEKGTLVE 
KYMNRDDLKE LLRKADAEEE GVKLSPWFRL VVDNFLFKWW DHVEKGSLKD 
KYVSREELKE LVKKADAGDE AVKLSPWFRL VVDNFLMKWW DHVEKGTITE 
KWVSPNDLKT MF ADP SYKFTPWFKI ICENYLFNWW EQLDDLSEVE 

301 

KHEDWGTVHH INEA* 
KHEDWGTVHH INEA* 
A.IDMKTIHK L* 
A.ADMKTIHK L* 
A.ADMKTIHK L* 
A.ADMKTIHK L* 
NDRQ...IHR ML* 
FIG. 12 

ctccgtcgct cttacUccgc catgggrgac 
attgtcacrt gatggagaag attgaaacag 
gcacagagca 
201 trcagcgrtt: tuctattcaa trcaaaatac gagtracttc tticagcaacg 
251 gtictgcaacc aaggngacat tzcazzzzqt atggaccaac acctgtitgca 
301 gccatccact ctacagagaa tccgagctLUg 
gcctgagaga 
xxxxxxxxxx 
ttgatggttt gcaattticaa qttcztztcq 
atctaaaaaa 
FIG. 14A 

Adonis palaestina ε-cyclase cDNA #5 Length: 1898 

1 aaaggagtgt tctattaatg ttactgtcgc attcttgcaa cacttatatt 

51 caaactccat tttcttcttt tctcttcaaa acaacaaact aatgtgagca 

101 gagtatctgg ctatggaact acttggtgtt cgcaacctca tctcttcttg 

151 ccctgtgtgg acttttggaa caagaaacct tagtagttca aaactagctt 

201 ataacataca tcgatatggt tcttcttgta gagtagattt tcaagtgaga 

251 gctgatggtg gaagcgggag tagaagttct gttgcttata aagagggttt 

301 tgtggatgaa gaggatttta tcaaagctgg tggttctgag cttttgtttg 

351 tccaaatgca gcaaacaaag tctatggaga aacaggccaa gctcgccgat 

401 aagttgccac caataccttt tggagaatcc gtgatggact tggttgtaat 

451 aggttgtgga cctgctggtc tttcactggc tgcagaagct gctaagctag 

501 ggttgaaagt tggccttatt ggtcctgatc ttccttttac aaataattat 

551 ggtgtgtggg aagacgagtt caaagatctt ggacttgaac gttgtatcga 

601 gcatgcttgg aaggacacca tcgtatatct tgataatgat gctcctgtcc 

651 ttattggtcg tgcatatgga cgagttagtc gacatttgct acatgaggag 

701 ttgctgaaaa ggtgtgtgga gtcaggtgta tcatatctgg attctaaagt 

751 ggaaaggatc actgaagctg gtgatggcca tagccttgta gtttgtgaaa 

801 atgagatctt tatcccttgc aggcttgcta ctgttgcatc tggagcagct 

851 tcagggaaac ttttggagta tgaagtaggt ggccctcgtg tttgtgtcca 

901 aaccgcttat ggggtggagg ttgaggtgga gaacaatcca tacgatccca 

951 acttaatggt attcatggac tacagagact atatgcaaca gaaattacag 

1001 tgctcggaag aagaatatcc aacatttctC tatgtcatgc ccatgtcgcc 

1051 aacaagactt ttttttgagg aaacctgttt ggcctcaaaa gatgccatgc 

1101 cattcgatct actgaagaga aaactgatgt cacgattgaa gactctgggt 

1151 atccaagtta caaaagttta tgaagaggaa tggtcatata ttcctgttgg 

1201 tggttcttta ccaaacacag agcaaaagaa cctagcattt ggtgctgcag 

1251 caagcatggt gcatccagca acaggctatt cggttgtacg gtcactgtca 

1301 gaagctccaa aatatgcttc tgtaattgca aagattttga agcaagataa 

1351 ctctgcgtat gtggtttctg gacaaagtag tgcagtaaac atttcaatgc 

1401 aagcatggag cagtctttgg ccaaaggagc gaaaacgtca aagagcatTc 

1451 tttcttttTg gattagagct tattgtgcag ctagatattg aagcaaccag 

1501 aacattcttt agaaccttct tccgcttgcc aacttggatg tggtggggtt 

1551 tccttgggtc ttcactatca tctttcgatc tcgtcttgtt ttccatgtac 

1601 atgtttgttt tggcgccaaa cagcatgagg atgtcacttg tgagacattt 

1651 gctttcagat ccttctggtg cagttatggt aagagcttac ctcgaaaggt 

1701 agtctcatct attattaaac tqtagtgttt caccaaataa atgaggatcc 

1751 ttcgaatgtg tatatgatca tctctatgta tatcctgtac tctaatctca 

1801 taaagtaaat gccgggtttg atattgttgt gtcaaaccgg ccaatgatat 

1851 aaagtaaatt tattgataca aaagtagttt ttttccttaa aaaaaaaa 
FIG. 14B 



Adonis palaestina ε-cyclase #5 predicted polypeptide 
TRANSLATE from: 113 to: 1702 Length: 529 amino acids 

1 MELLGVRNLI SSCPVWTFGT RNLSSSKLAY NIHRYGSSCR VDFQVRADGG 

51 SGSRSSVAYK EGFVDEEDFI KAGGSELLFV QMQQTKSMEK QAKLADKLPP 

101 IPFGESVMDL VVIGCGPAGL SLAAEAAKLG LKVGLIGPDL PFTNNYGVWE 

151 DEFKDLGLER CIEHAWKDTI VYLDNDAPVL IGRAYGRVSR HLLHEELLKR 

201 CVESGVSYLD SKVERITEAG DGHSLVVCEN EIFIPCRLAT VASGAASGKL 

251 LEYEVGGPRV CVQTAYGVEV EVENNPYDPN LMVFMDYRDY MQQKLQCSEE 

301 EYPTFLYVMP MSPTRLFFEE TCLASKDAMP FDLLKRKLMS RLKTLGIQVT 

351 KVYEEEWSYI PVGGSLPNTE QKNLAFGAAA SMVHPATGYS VVRSLSEAPK 

401 YASVIAKILK QDNSAYVVSG QSSAVNISMQ AWSSLWPKER KRQRAFFLFG 

451 LELIVQLDIE ATRTFFRTFF RLPTWMWWGF LGSSLSSFDL VLFSMYMFVL 

501 APNSMRMSLV RHLLSDPSGA VMVRAYLER* 
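
The "TRANSLATE from: 113 to: 1702" annotation can be checked with a short script. The following is a minimal sketch in Python, assuming the standard genetic code; the helper names `translate` and `predicted_length` are illustrative and are not part of the original sequence-analysis output. It translates the first 39 coding bases of the FIG. 14A cDNA and confirms that an open reading frame spanning positions 113 to 1702, with the final codon being the stop codon, corresponds to 529 amino acids.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna, start, end):
    """Translate cdna from 1-based position start through end (inclusive)."""
    region = cdna.lower().replace(" ", "")[start - 1:end]
    protein = []
    for i in range(0, len(region) - len(region) % 3, 3):
        # Codons containing ambiguous or unreadable bases are reported as 'X'.
        protein.append(CODON_TABLE.get(region[i:i + 3], "X"))
    return "".join(protein)


def predicted_length(start, end):
    """Residue count for an ORF whose last codon is the stop codon."""
    return (end - start + 1) // 3 - 1


# First 39 coding bases of the FIG. 14A cDNA (positions 113-151).
fragment = "atggaactacttggtgttcgcaacctcatctcttcttgc"
print(translate(fragment, 1, len(fragment)))  # MELLGVRNLISSC
print(predicted_length(113, 1702))            # 529
```

Coordinates are 1-based and inclusive, matching the numbering used in the figures, so a full cDNA pasted from FIG. 14A (with or without the spaces between blocks) could be passed as `translate(cdna, 113, 1702)`.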
FIG. 15A 



DNA sequence of potato cDNA (GenBank R27545) obtained from Nicholas J. 
Provart 

potato.seq Length: 1378 August 2, 1996 13:06 Type: N Check: 605 .. 

1 tagcggnnnn naggatgagt tcaaagatct tggtcttcaa gcctgcattg 

51 aacatgtttg gcgggatacc attgtatatc ttgatgatga tgatcctatt 

101 cttattggcc gtgcctatgg aagagttagt cgccatttac tgcacgagga 

151 gttactcaaa aggtgtgtgg aggcaggtgt tttgtatcta aactcgaaag 

201 tggataggat tgttgaggcc acaaatggcc acagtcttgt agagtgcgag 

251 ggtgatgttg tgattccctg caggtttgtg actgttgcat cgggagcagc 

301 ctcggggaaa ttcttgcagt atgagttggg aggtcctaga gtttctgttc 

351 aaacagctta tggagtggaa gttgaggtcg ataacaatcc atttgacccg 

401 agcctgatgg ttttcatgga ttatagagac tatgtcagac acgacgctca 

451 atctttagaa gctaaatatc caacatttct ctatgccatg cccatgtctc 

501 caacacgagt ctttttcgag gaaacttgtt tggcttcaaa agatgcaatg 

551 ccattcgatc tgttaaagaa aaaattgatg ttacgattga acaccctcgg 

601 tgtaagaatt aaagaaattt atgaggagga atggtcttac ataccagttg 

651 gaggatcttt gccaaataca gaacaaaaaa cacttgcatt tggtgctgct 

701 gctagcatgg ttcatccagc cacaggttat tcagtcgtca gatcactgtc 

751 tgaagctcca aaatgcgcct tcgtgcttgc aaatatatta cgacaaaatc 

801 atagcaagaa tatgcttact agttcaagta ccccgagtat ttcaactcaa 

851 gcttggaaca ctctttggcc acaagaacga aaacgacaaa gatcgttttt 

901 cctatttgga ctggctctga tattgcagct ggatattgag gggataaggt 

951 catttttccg cgcgttcttc cgtgtgccaa aatggatgtg gcagggattt 

1001 cttggttcaa gtctttcttn agcagacctc atgttatttg ccttctacat 

1051 gtttattatt gcaccaaatg acatgagaag aggcttaatc agacatcttt 

1101 tatctgatcc tactggtgca acattgataa gaacttatct tacattttag 

1151 agtaaattcc tcctacaata gttgttgaan nagaggcctc attacttcag 

1201 attcataaca gaaatcgcgg tctctcgagg ccttgtatat aacattttca 

1251 ctaggttaat attgcttgaa taagttgcac agtttcagtt tttgtatctg 

1301 cttctttttt gtccaagatc atgtattgan ccaatttata tacattgcca 

1351 gtatatataa attttataaa aaaaaaaa 

poteps.pep Length: 378 TRANSLATE from: 14 to: 1147 

1 DEFKDLGLQA CIEHVWRDTI VYLDDDDPIL IGRAYGRVSR HLLHEELLKR 

51 CVEAGVLYLN SKVDRIVEAT NGHSLVECEG DVVIPCRFVT VASGAASGKF 

101 LQYELGGPRV SVQTAYGVEV EVDNNPFDPS LMVFMDYRDY VRHDAQSLEA 

151 KYPTFLYAMP MSPTRVFFEE TCLASKDAMP FDLLKKKLML RLNTLGVRIK 

201 EIYEEEWSYI PVGGSLPNTE QKTLAFGAAA SMVHPATGYS VVRSLSEAPK 

251 CAFVLANILR QNHSKNMLTS SSTPSISTQA WNTLWPQERK RQRSFFLFGL 

301 ALILQLDIEG IRSFFRAFFR VPKWMWQGFL GSSLSXADLM LFAFYMFIIA 

351 PNDMRRGLIR HLLSDPTGAT LIRTYLTF* 
FIG. 15B 



Chimeric lettuce/potato lycopene ε-cyclase: converts lycopene to δ- 
carotene; the lettuce cDNA converts lycopene to ε-carotene and the 
potato cDNA does not produce an active enzyme 

(amino acids in lower case are from lettuce and those in uppercase 
are from the potato cDNA; an AvaII site in common to the two cDNAs 
was used to construct the chimera) 

1 mecfgarnmt atmavftcpt ftdcnirhkf sllkqrrftn lsassslrqi 

51 kcsaksdrcv vdkqgisvac eedyvkaggs elffvqmqrt ksmesqskls 

101 eklaqipign cildlvvigc gpaglalaae saklglnvgl igpdlpftnn 

151 ygvwqdefig lglegciehs wkdtlvyldd adpirigray grvhrdllhe 

201 ellrrcvesg vsylsskver iteapngysl iecegnitip crlatvasga 

251 asgkfleyel gGPRVSVQTA YGVEVEVDNN PFDPSLMVFM DYRDYVRHDA 

301 QSLEAKYPTF LYAMPMSPTR VFFEETCLAS KDAMPFDLLK KKLMLRLNTL 

351 GVRIKEIYEE EWSYIPVGGS LPNTEQKTLA FGAAASMVHP ATGYSVVRSL 

401 SEAPKCAFVL ANILRQNHSK NMLTSSSTPS ISTQAWNTLW PQERKRQRSF 

451 FLFGLALILQ LDIEGIRSFF RAFFRVPKWM WQGFLGSSLS XADLMLFAFY 

501 MFIIAPNDMR RGLIRHLLSD PTGATLIRTY LTF* 
FIG. 16 

GAP comparison of Arabidopsis ε-cyclase x potato ε-cyclase (partial) 
blosum62.cmp Gap Weight: 12 Average Match: 2.912 
Length Weight: 4 Average Mismatch: -2.003 
Quality: 1485 Length: 529 

Ratio: 3.929 Gaps: 1 

Percent Similarity: 79.893 Percent Identity: 76.139 
Match display thresholds for the alignment(s) : 
| = IDENTITY : = 2 . = 1 

151 EDEFNDLGLQKCIEHVWRETIVYLDDDKPITIGRAYGRVSRRLLHEELLR 200 

II! Mill HllllhlllllMI II IIIIMllil MUNI: 

1 .DEFKDLGLQACIEHVWRDTIVYLDDDDPILIGRAYGRVSRHLLHEELLK 49 

201 RCVESGVSYLSSKVDSITEASDGLRLVACDDNNVIPCRLATVASGAASGK 250 

50 RCVEAGVLYLNSKVDRIVEATNGHSLVECEGDVVIPCRFVTVASGAASGK 99 

251 LLQYEVGGPRVCVQTAYGVEVEVENSPYDPDQMVFMDYRDYTNEKVRSLE 300 

HIMIIII illl!!IIIM:M:|l IMIIIIM -11! 
100 FLQYELGGPRVSVQTAYGVEVEVDNNPFDPSLMVFMDYRDYVRHDAQSLE 149 

301 AEYPTFLYAMPMTKSRLFFEETCLASKDVMPFDLLKTKLMLRLDTLGIRI 350 

. • • IIIIMI 

150 AKYPTFLYAMPMSPTRVFFEETCLASKDAMPFDLLKKKLMLRLNTLGVRI 199 

351 LKTYEEEWSYIPVGGSLPNTEQKNLAFGAAASMVHPATGYSVVRSLSEAP 400 

. IIIIIIIIIMIIIMIIII illlMIIIIIMIIIIIIIIIIill 
200 KEIYEEEWSYIPVGGSLPNTEQKTLAFGAAASMVHPATGYSVVRSLSEAP 249 

401 KYASVIAEILREETTKQI.....NSNISRQAWDTLWPPERKRQRAFFLFG 445 

| | |:| |||: .| . -II Ill-Mil IIMII-IIIII 

250 KCAFVLANILRQNHSKNMLTSSSTPSISTQAWNTLWPQERKRQRSFFLFG 299 

446 LALIVQFDTEGIRSFFRTFFRLPKWMWQGFLGSTLTSGDLVLFALYMFVI 495 

||||. I I llllllll IIUIIIIIHIIM- M-lll MM 
300 LALILQLDIEGIRSFFRAFFRVPKWMWQGFLGSSLSXADLMLFAFYMFII 349 

496 SPNNLRKGLINHLISDPTGATMIKTYLKV 524 

.||.:hlll lhllllllh|:lll 
350 APNDMRRGLIRHLLSDPTGATLIRTYLTF 378 
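
The "Percent Identity" figure reported above can be illustrated with a small calculation. The following is a minimal sketch in Python, assuming that identity is counted only over alignment columns in which neither sequence has a gap; the function name `percent_identity` is illustrative and is not taken from the alignment program. Under that assumption, the reported values are consistent with 284 identical and 298 similar residues out of 373 gap-free columns (374 Arabidopsis residues and 378 potato residues aligned over 379 columns, six of which contain a gap character).

```python
GAP_CHARS = ".-"


def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # gap columns are excluded from the denominator
        compared += 1
        identical += a == b
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Final block of the comparison (Arabidopsis 496-524 vs potato 350-378).
arabidopsis = "SPNNLRKGLINHLISDPTGATMIKTYLKV"
potato = "APNDMRRGLIRHLLSDPTGATLIRTYLTF"
print(round(percent_identity(arabidopsis, potato), 1))

# Whole-alignment figures quoted in the header, from residue counts.
print(round(100 * 284 / 373, 3))  # 76.139
print(round(100 * 298 / 373, 3))  # 79.893
```

The same function could be applied to any pair of aligned sequences when checking a threshold such as "at least 85% identical", bearing in mind that other programs may define the denominator differently.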
FIG. 17A 



Adonis palaestina Ipi1 

1 attcatcttc agcagcgctg tcgtactctt tctatatctt cttccatcac 

51 taacagtagt cgccgacggt tgaatcggct attcgcctca acgtcaacta 

101 tgggtgaagt cactgatgct ggaatggatg ctgttcagaa gcggctcatg 

151 ttcgacgacg aatgtatttt ggtggatgag aatgacaagg tcgtcgggca 

201 tgattccaaa tacaactgtc atttgatgga aaagatagag gcagaaaatt 

251 tgcttcacag agccttcagt gttttcttgt tcaactcaaa atatgaattg 

301 cttcttcagc aacgatccgc cacaaaggta acattcccgc tcgtatggac 

351 aaacacatgt tgcagtcatc ctctctttcg tgattccgag ctcatagaag 

401 aaaattatct cggtgtacga aacgctgcac aaagaaagct tttagacgag 

451 ctaggcattc cagctgaaga tgtcccagtt gatgaattta ctcctcttgg 

501 tcgcattctt tacaaagctc catctgacgg caaatgggga gagcacgaat 

551 tggactatct cctatttatt gtccgagatg tgaaatacga tccaaaccca 

601 gatgaagttg ctgatgctaa gtatgttaat cgcgaggagt tgagagagat 

651 actgagaaaa gctgatgctg gtgaagaggg actcaagttg tctccttggt 

701 ttagattggt tgttgataac tttttgttca agtggtggga tcatgtagag 

751 cagggtacga ttaaggaagt tgctgacatg aaaactatcc acaagttgac 

801 ttaagaggac ttctctcctc tgttctacta tttgtttttt gctacaataa 

851 gtgggtggtg ataagcagtt tttctgtttt ctttaattta tggcttttga 

901 atttgcctcg atgttgaact tgtaacatat ttagacaaat atgagacctt 

951 gtaagttgaa tttgaggctg aatttatatt tttgggaaca taataatgtt 

1001 aa 
FIG. 17B 

Adonis palaestina Ipi2 

1 ttttaaagct ctttcgctcc accaccatca aagccagcca aatttctctg 
51 tacaaaagtt aaaaacaccg ctttgggctt tggcccctcc atatcggaat 
101 ccttgtttac gatacgcatc taaaccagta attctcggtt ttaatttgtt 
151 tcctaaatta ggcccctttc cggaatcccg agaattatgt cgtcgatcag 
201 gattaatcct ttatatagta tcttctccac caccactaaa acattatcag 
251 cttcgtgttc ttctcccgct gttcatcttc agcagcgttg tcgtactctt 
301 tctatttctt cttccatcac taacagtcct cgccgagggt tgaatcggct 
351 gttcgcctca acgtcgacta tgggtgaagt cgctgatgct ggtatggatg 
401 ccgtccagaa gcggcttatg ttcgacgatg aatgtatttt ggtggatgag 
451 aatgacaagg tcgtcggaca tgattccaaa tacaactgtc atttgatgga 
501 aaagatagag gcagaaaact tgcttcacag agccttcagt gttttcttat 
551 tcaactcaaa atacgagttg cttcttcagc aacgatctgc aacgaaggta 
601 acattcccgc tcgtatggac aaacacctgt tgcagccatc ccctcttccg 
651 tgattccgaa ctcatagaag aaaattttct cggggtacga aacgctgcac 
701 aaaggaagct tttagacgag ctaggcattc cagctgaaga cgtaccagtt 
751 gatgaattca ctcctcttgg tcgcattctt tacaaagctc catctgacgg 
801 aaaatgggga gagcacgaac tggactatct tctgtttatt gtccgagatg 
851 tgaaatacga tccaaaccca gatgaagttg ctgacgctaa gtacgttaat 
901 cgcgaggagt tgaaagagat actgagaaaa gctgatgcag gtgaagaggg 
951 aataaagttg tctccttggt ttagattggt tgtggataac tttttgttca 
1001 agtggtggga tcatgtagag gaggggaaga ttaaggacgt cgccgacatg 
1051 aaaactatcc acaagttgac ttaagagaaa gtctcttaag ttctactatt 
1101 tggtttttgc ttcaataagt ggatggtgat gagcagtttt tatgcttcct 
1151 ttaattttgg cttttcaatt tgctttatgt gttgaacttg taacatattt 
1201 agtcaaatat gagaccttgt gagttgaatt tgaggttata tttatagttt 
1251 tgggaacata aaaaaaaaaa 
FIG. 18A 



Haematococcus pluvialis Ipi1 

1 ctcggtagct ggccacaatc gctatttgga acctggcccg gcggcagtcc 

51 gatgccgcga tgcttcgttc gttgctcaga ggcctcacgc atatcccccg 

101 cgtgaactcc gcccagcagc ccagctgtgc acacgcgcga ctccagttta 

151 agctcaggag catgcagatg acgctcatgc agcccagcat ctcagccaat 

201 ctgtcgcgcg ccgaggaccg cacagaccac atgaggggtg caagcacctg 

251 ggcaggcggg cagtcgcagg atgagctgat gctgaaggac gagtgcatct 

301 tggtggatgt tgaggacaac atcacaggcc atgccagcaa gctggagtgt 

351 cacaagttcc taccacatca gcctgcaggc ctgctgcacc gggccttctc 

401 tgtgttcctg tttgacgatc aggggcgact gctgctgcaa cagcgtgcac 

451 gctcaaaaat caccttccca agtgtgtgga cgaacacctg ctgcagccac 

501 cctttacatg ggcagacccc agatgaggtg gaccaactaa gccaggtggc 

551 cgacggaaca gtacctggcg caaaggctgc tgccatccgc aagttggagc 

601 acgagctggg gataccagcg caccagctgc cggcaagcgc gtttcgcttc 

651 ctcacgcgtt tgcactactg tgccgcggac gtgcagccag ctgcgacaca 

701 atcagcgctc tggggcgagc acgaaatgga ctacatcttg ttcatccggg 

751 ccaacgtcac cttggcgccc aaccctgacg aggtggacga agtcaggtac 

801 gtgacgcaag aggagctgcg gcagatgatg cagccggaca acgggctgca 

851 atggtcgccg tggtttcgca tcatcgccgc gcgcttcctt gagcgttggt 

901 gggctgacct ggacgcggcc ctaaacactg acaaacacga ggattgggga 

951 acggtgcatc acatcaacga agcgtgaaag cagaagctgc aggatgtgaa 

1001 gacacgtcat ggggtggaat tgcgtacttg gcagcttcgt atctcctttt 

1051 tctgagactg aacctgcagt caggtcccac aaggtcaggt aaaatggctc 

1101 gataaaatgt accgtcactt tttgtcgcgt atactgaact ccaagaggtc 

1151 aaaaaaaaaa aaaaa 
FIG. 18B 



Haematococcus pluvialis Ipi2 

1 tggaacctgg cccggcggca gtccgatgcc gcgatgcttc gttcgttgct 

51 cagaggcctc acgcatatcc cgcgcgtgaa ctccgcccag cagcccagct 

101 gtgcacacgc gcgactccag tttaagctca ggagcatgca gctgcttgcc 

151 gaggaccgca cagaccacat gaggggtgca agcacctggg caggcgggca 

201 gtcgcaggat gagctgatgc tgaaggacga gtgcatctta gtggatgctg 

251 acgacaacat cacaggccat gccagcaagc tggagtgcca caaattccta 

301 ccacatcagc ctgcaggcct gctgcaccgg gccttctctg tgttcctgtt 

351 tgacgaccag gggcgactgc tgctgcaaca gcgtgcacgc tcaaaaatca 

401 ccttcccaag tgtgtggacg aacacctgct gcagccaccc tctacatggg 

451 cagaccccag atgaggtgga ccaactaagc caggtggccg acggcacagt 

501 acctggcgca aaagctgctg ccatccgcaa gttggagcac gagctgggga 

551 taccagcgca ccagctgccg gcaagcgcgt ttcgcttcct cacgcgtttg 

601 cactactgtg ccgcggacgt gcagccggct gcgacacaat cagcgctctg 

651 gggcgagcac gagatggact acatcttatt catccgggcc aacgtcacct 

701 tggcgcccaa ccctgacgag gtggacgaag tcaggtacgt gacgcaagag 

751 gagctgcggc agatgatgca gccggacaac gggttgcaat ggtcgccgtg 

801 gtttcgcatc atcgccgcgc gcttccttga gcgttggtgg gctgacctgg 

851 acgcggccct aaacactgac aaacacgagg attggggaac ggtgcatcac 

901 atcaacgaag cgtgaaggca gaagctgcag gatgtgaaga cacgtcatgg 

951 ggtggaattg cgtacttggc agcttcgtat ctcctttttc tgagactgaa 

1001 cctgcagagc tagagtcaat ggtgcatcat attcatcgtc tctcttttgt 

1051 tttagactaa tctgtagcta gagtcactga tgaatccttt acaactttca 

1101 aaaaaaaaa 






WO 99/63055 PCT/US99/12121 

27/45 09/701395 



FIG. 19A 



Lactuca sativa Ipi1 

1 tgccaaaatg ttgaaatttc ccccttttaa aaccattgct accatgatct 

51 cttctccata ttcttccttc ttgctgcctc gqaaatcttc tttccctcca 

101 atgccgtctc tcgcagccgc tagtgttttc ctccaccctc tttcgtctgc 

151 cqctatgggc gartccagca tggaigctqt ccagcgacgt ctcatgttcg 

201 atgacgaaig cattttggtg gatgagaatg acaaagtggt tggccatgat 

251 actaaataca attgtcattt gatggagaag attgaaaagg gaaatatgct 

301 acacagagca ttcagtgtgt tcttgttcaa ctcgaaatat gaattactcc 

351 ttcagcaacg ttctgcaacc aaggtgactt tccctttggt atggacaaac 

401 acgtgttgca gccatccact atacagggag agtgagctta ttgacgaaaa 

451 cgcccttggg gtgaggaatg ctgcacagag gaagctcctg gatgaactcg 

501 gcatccctgg agcagatgti ccggttgatg agttcactcc attgggtcgc 

551 attctataca aggccgcatc ggatggaaag tggggagaac atgaacttga 

601 ttacctgctg tttiatggtac gigatgttgg tttggatccg aacccagatg 

651 aagtgaaaga tgtaaaatat gtgaaccggg aagagctgaa ggaattggta 

701 aggaaggcgg atgctggtga agagggtgtg aagctgtccc cgtggttcaa 

751 atigattgtc gaiaatttct tgtticagtg gtgggatcga ctccataagg 

801 gaaccctaac cgaagctatt gatatgaaaa caatccacaa actcacataa 

851 aaacactaca ctagtaggag agaggattat atgagatatt tgttatatgt 

901 gaaattgaaa ttcagatgaa tgcttgtatt tatttctatt tggacaaact 

951 tcaacttctt tttgctacct tatcagaaaa aaaaa 

FIG. 19B 

Lactuca sativa Ipi2 

1 tattcgcttc aaaatctctt ccattaactg ctcaaatctc caccttcgcc 

51 ggtcttaatc tccgccggcg cactttcacc accataaccg ccgccatggg 

101 tgacgattcc ggcatggacg ctgtccagag acgtctcatg tttgatgatg 

151 aatgcatttt ggttgatgaa aatgacaatg ttcttgggca tgataccaaa 

201 tacaattgtc acttgatgga gaagattgag aaagataatt tgcttcatag 

251 agcattcagt gtatttttat tcaattcaaa atacgaatta ctccttcagc 

301 aaaggtcaga aaccaaggtg acatttcctt tggtatggac aaacacctgt 

351 tgcagccatc cactatacag agaatcggag ttaattcccg aaaatgccct 

401 tggggtcaga aatgctgcac agaggaagct tctagatgaa ctcggratcc 

451 ctgctgaaga tgttccagtt gatgagttca caacittagg tcgcatgttq 

501 tacaaggctc catctgaigg aaaatggggt gaacatgaag ttgattacct 

551 actcttcctc gtgcgtgacg ttgccgtgaa cccaaaccct gatgaggtgg 

601 cggacattag atacgtgaac caagaagagt taaaagagtt actaaggaaq 

651 gcqgatgcgg gtgaqgaggg tttgaaattg tccccatggt ttaggctagt 

701 ggtggacaac ttcttgttca aatggtggga tcatgtccaa aagggqacac 

751 tcaatgaagc aattgacatg aaaaccattc ataagttgat atqaaaaatg 

801 gttaatattt atggtggtgg tttggagcta ataatttqtg tgttcaagtc 

851 tcggtccttc tttttttaac gttttttttt tttcttttat tgggagtgtt 

901 tattgtgtac ttgtaacgta ggccctttgg ttacgcttta agagtttaat 

951 aaagaaccac cgttaattta aaaaaaaaaa aaaaaaaa 
FIG. 20 

Chlamydomonas reinhardtii Ipi1 

(Note: the isomerase cDNA probably ends at ca. base 1103; the 
second half of the cDNA is similar to extensin and other 
hydroxyproline-rich structural proteins) 

1 ggcacgagct cgagtttgtt ttaccatgac atcgggaatt tggaagcttq 

51 aactacctca attactcaag taactcgcgg caacacattt cgcqcgccat 

101 cgctgttttc tctgctccag ctaccgagca gcattgcttt agatcgcttt 

151 gatgtcataa actcccactt atatgagatc cagtttcatc gagcccaagc 

201 ccagagcgca acctgtctta agccgcggca gggcgtccat gcgcctcgcg 

251 caaagccqtg ctctcgttgc gcgtgtcagc tccgccctgt ggccgggagc 

301 aggactttca caggctcaaa gcgttgcgqt gcgaatggcg agttcgtcaa 

351 cctgggaagg cacgggcctg agccaggatg acttcatgca gcgggacgag 

401 tgcttggtgg tggacgagca ggaccggctg ctaggcaccg ccaacaagta 

451 cgactgccac cgcttcgagg cggccaaggg ccagccctgc ggccgccigc 

501 accgcgcctt ctccgtgttc cigttcaqcc ccgacggccg actgctgctg 

551 cagcagcgcg cagccagcaa ggtgacgttc ccgggtgtgt ggaccaacac 

601 ctgctgctcq cacccgctgg cgggccaggc gccggacgag gtggacctgc 

651 cggcggcggt agcctcgggc caggtqccgg gcatcaaggc qqcggcggtg 

701 cgcaagctgc agcacgagct ggggataccg ccggagcagg itcccgcctc 

751 ciccttctcc ttcctcacgc qtctgcacta ctgcgccgcc gacaccgcca 

801 cgcacggccc ggcggcggag tggggcgagc acgaggtgga ctacgtgctq 

851 ttcgtgcggc cgcagcaqcc cgtcagcctg caqcccaacc cagacgaggt 

901 ggacgccacg cgctacgtga cgctgccgga gcttcaqtcc atgatggcgg 

951 accccggcci cagctqgaqc ccctggttcc gcatcctggc cacacagccc 

1001 gccttcctgc ccgcctggtg gggcgacctg aagcggcgct ggcqcccggg 

1051 cggcagccga ctgtaggact ggggcaccat ccaccgcgtc atgtgaagaa 

1101 aaaggggaag cagqggcggg agcgggggat gaatgggaat gtqaatgcga 

1151 ttgtgatgcg gcgtggqatg aggtctgaag acagggggaa aatcgggggg 

1201 cgggcgtgag cgtgtgtgta cgtgagcgac aaagccggga ggcgqaccgc 

1251 gcgatgggta catgtgtgtg cggagggtcg gtgqgtcggt cggttgcgcg 

1301 gcatagcgtg ttgtgtgtgt gcggctgcgc gggtatqtqg gcacccggqc 

1351 acggaggaqa aggcacacgc aggtggcgcg gaggtgtgtc aggggccatg 

1401 ggcggqcctc actcctggfc gtgcccagtg gtctcgtggg cagagtgqca 

1451 qgggctgcac ccatatgagc ggcgcactgc cgcgctgggc taagtcctta 

1501 tcacttggtg aggtgqggcg agqtggctgt qggcggcggg cgcagtggca 

1551 gaaggacacg gtgtgtgagc qgtggaqctc tggccqtgcc ggccgtgagg 

1601 ggcggatagc gatatgacgt tgtgcttggc cgctgtaatg cgggagaatg 

1651 tgcaggccgc gagaagcggg cggtggcagg aggccgcagg ctgcagcacc 

1701 cgttggggag gtgccgcctg caggcgcggc gccgggcggg cctgagtaat 

1751 gggcgcctga gtagtggcgg ccacaggagg cgcaggaggc agcagcagga 

1801 ggacgagctg gagggacccg ttggcaaccc aaggttgcgc gtgtaacata 

1851 gtggccatac aaaaaaaaaa aaaa 
FIG. 21A 



Tagetes erecta Ipi1 

1 ccaaaaacaa ctcaaatctc ctccgtcgct cttactccgc catgggtgac 

51 gactccggca tggatgctgt tcagcgacgt ctcatgtttg acgatgaatg 

101 cattttggtg gatgagtgtg acaatgtggt gggacatgat accaaataca 

151 attgtcactt gatggagaag attgaaacag gtaaaatgct gcacagagca 

201 ttcagcgttt ttctattcaa ttcaaaatac gagttacttc ttcagcaacg 

251 gtctgcaacc aaggtgacat ttcctttagt atggaccaac acctgttgca 

301 gccatccact ctacagagaa tccgagcttg ttcccgaaaa cgcccttgga 

351 gtaagaaatg ctgcacagag gaagctgttg gatgaactcg gtatccctgc 

401 tgaagatgtt cccgttgatc agtttactcc tttaggtcgc atgctctaca 

451 aggctccatc tgatggaaag tggggagaac atgaacttga ctacctactt 

501 ttcatagtga gagacgttgc tgtaaacccg aacccagatg aagtggcgga 

551 tatcaaatat gtganccang aagagttaaa ggagctgcta aggaaagcag 

601 atgcggggga ggagggtttg aagctgtctc catggttcag gttagtggtt 

651 gataacttct tgttcaagtg gtgggatcat gtgcaaaagg gtacactcac 

701 tgaagcaatt gatatgaaaa ccatacacaa gctgatatag aaacacaccc 

751 tcaaccgaaa agttcaagcc taataattcg ggttgggtcg ggtctaccat 

801 caattgtttt tttcttttaa gaagttttaa tctctatttg agcatgttga 

851 ttcttgtctt ttgtgtgtaa gattttgggt ttcgtttcag ttgtaataat 

901 gaaccattga tggtttgcaa tttcaagttc ctatcgacat gtagtgatct 

951 aaaaaa 
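
Listings in this layout (a position number followed by blocks of ten bases) can be joined into a single string for further analysis. The following is a minimal Python sketch, not part of the original disclosure; the function name `join_numbered_listing` is hypothetical, and the consistency check assumes that the leading number on each line is the 1-based position of the first base on that line.

```python
def join_numbered_listing(text):
    """Concatenate a numbered sequence listing into one lowercase string.

    Assumes each non-blank line is '<start position> <block> <block> ...'.
    """
    sequence = ""
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip the blank spacer lines between rows
        start = int(fields[0])
        # The row must begin exactly one past the bases collected so far.
        if start != len(sequence) + 1:
            raise ValueError(
                f"row starting at {start} does not follow position {len(sequence)}"
            )
        sequence += "".join(fields[1:]).lower()
    return sequence


# Toy listing (not taken from the figures): 20 bases on the first row, 5 on the second.
toy = "1 acgtacgtac gtacgtacgt\n\n21 ttgca\n"
print(join_numbered_listing(toy))       # acgtacgtacgtacgtacgtttgca
print(len(join_numbered_listing(toy)))  # 25
```

A row whose leading number does not match the running length raises an error, which is a simple way to notice a dropped or duplicated block.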

FIG. 21B 

Oryza sativa Ipi1 

1 cctccctttg cctcgcgcag aggcggccgc gccttctccg ccgcgaggat 

51 ggccggcgcc gccgccgccg tggaggacgc cgggatggac gaggtccaga 

101 agcggctcat gttcgacgac gaatgcattt tggtggatga acaagacaat 

151 gttgttggcc atgaatcaaa atataactgc catctaatgg aaaaaatcga 

201 atctgaaaat ctacttcata gggctttcag tgtattcctg ttcaactcaa 

251 aatatgaact cctactccag caacgatctg caacaaaggt tacatttcct 

301 ctagtttgga ccaacacttg ctgcagccat cctctgtacc gtgagtctga 

351 gcttatacag gaaaactacc ttggtgttag aaatgctgct cagaggaagc 

401 tcttggatga gctgggcatc ccagctgaag atgtgccagt tgaccaattc 

451 acccctctrg gtcggatgct ttacaaggcc ccatctgatg gaaaatgggg 

501 tgaacacgag cttgactacc tgctgttcat cgtccgcgac gtgaaggtag 

551 tcccgaaccc ggacgaagtg gccgatgtga aatacgtgag ccgtgagcag 

601 ctgaaggagc tcatccgcaa agcggacgcc ggagaggaag gcctgaagct 

651 gtctccctgg ttccggctgg ttgttgacaa cttcctcatg ggctggtggg 

701 atcacgtcga gaaaggcacc ctcaacgagg ccgtggacat ggagaccatc 

751 cacaagctga agtaaggact gcgatgttgt ggctggaaag aatgatcctg 

801 aagactctgt tcttgtgctg ctgcatatta ctcttaccag ggaagttgca 

851 gaagtcagaa gaagcttttg tatgtttctg ggtttggagc uggaagtgt 

901 tgggctctgc tgactgagag attcccttat agagtgtcta tgttaattta 

951 gcaaacttct atattataca tgattagtta attgttcggt gtctgaataa 

1001 agaacaatag catgttccat gtttatttgc t 
Comparison using GAP program of the Genetics Computer Group 

Gap Weight: 50 Average Match: 10.000 

Length Weight: 3 Average Mismatch: 0.000 

Quality: 17392 Length: 1904 

Ratio: 9.411 Gaps: 3 

Percent Similarity: 95.331 Percent Identity: 95.331 
Match display thresholds for the alignment(s): 
| = IDENTITY : = 5 . = 1 



Adonis palaestina ε-cyclase #3 x Adonis palaestina ε-cyclase #5 

• « * • • 

1 gagagaaaaagagtgttatattaatgttactgtcgcattcttgcaacac. 49 

III lITIljj IIIIIIITJIIIIIIITIIIIIIJTIIIIII. „ 

1 aaaggagtgttctattaatgttactgtcgcattcttgcaacact 44 



50 .atattcagactccattttcttgttttctcttcaaaacaacaaactaatg 98 



45 tatattcaaad 



nil iiiiniiiiiji iiiiiiiiiiiiiin 

xtcaaactccattttcttcttttctcttcaaaacc 



i 



caaaacaacaaactaatg 94 



• » • • • 

99 tga.cggagtatctagctatggaactacttggtgttcgcaacctcatctc 147 

m nm II II iiiimiiiiiiiiiiriimi iiii iinii 

tqaqcaqaqtatctggctatggaactacttggtgttcgcaacctcatctc 1' 



95 tgagcagagtatctggctatggaactacttggtgttcgcaacctcatctc 144 
148 ttcttgccctgtctggacttttggaacaagaaaccttagtagttcaaaac 197 

iiiinmm iHiiiiiiiiniimiiiiiimmiiiiiii lM 

145 ttcttgccctgtgtggacttttggaacaagaaaccttagtagttcaaaac 194 



198 tagcttataacatacatcgatatggttcttcttgtagagtagattttcaa 247 

nil iJMi!l!!illI]li|M 



195 tagct 



.ataaca 



cacatcgata 



248 fflmt^^ Z 

245 gtgagagctgatggtggaagcgggagtagaagttctgttgcttataaaga 294 

a • • • • 

298 qqqttttgtqgjc|agg^ 347 
295 gggilllgiggitgaagaggalill^ 344 

# • • • * 

348 tgtttgtccaaatgcagcaaacaaagtctatggagaaacaggccaagctc 397 

■ - 1 1 1 1 111 ITI 1 1 1 1 1 1 ITJ I J 1 1TTITI I i I ITTl ! I ITI 1 1 , M 

345 tgtttgtccaaatgcagcaaacaaagtctatggagaaacaggccaagctc 394 
FIG. 23B 



398 qccqataaqttqccacca^ 

395 gccgaiaigllgcciccaalaccllltgg 444 



it 447 



448 ^l^?^^^ 497 
445 igiaaliggllglggickgc^ 494 

495 agclagggllgaaagilggcc!^ 544 
548 aattatggtgtgtgggaagacgagttcaaagatcttggacttgaacgttg 597 

III) | I Till II ] IT I I I I M I I I I I I I I I I I I I 1111 I I j 1 I I I I I rn« 

545 aattatggtgtgtgggaagacgagttcaaagatcttggacttgaacgttg 594 



598 tatcgagcatgcttggaaggacaccatcgtatatcttqacaatqatqcp 647 

[Iqataalqaiqci 



t ..m 

xgagcatgcttggaaggacaccatcg 



648 ctgtccttat 

I 111 I L 

645 ctgtcct 



gataatgatgctc 644 
itcqtqcatatqqacqaqt^ 697 
r:qcala":qqacgagiiagtcg^ 



694 



• * • 

698 gaagagttgctgaaaaggtgtgtcgagtcaggtgtatcatatctgaattc 747 



mm 



U\)\}\\\ II 

.tcatatctqqat 1 



695 gaggagttgctgaaaaggtgtgtgg^ 744 

• • • 

igtggaaaggatcactqaaqctqqtqatqgc 

II TTI 1 1TTIJI 1 1 JT I ITI ITTITI 111 



748 taaagtggaaaggatcactgaagctggtga^ 797 

745 taaagtggaaaggatcactgaagctggtgatggccatagccttgtagttt 794 

798 qtqaaaacqacatctttatcc^ 847 

795 glgaaaatgagaiclilalcccttgcagg 844 

848 mm imp if) i m 1 if ifilii Z 

845 gcagcttcagggaaacttttggagtatgaagtaggtggccctcgtgtttg 894 

f 



898 tqtccaaactqct 
895 tgiccaaaccgct 




aacaatccatacg 947 

....IIIIIIIIINII nfA 
ggagaacaatccatacg 944 
FIG. 23C 



948 atcccaacttaatggtatttatggactacagagactatatgcaacagaaa 997 

„ ilimniiiiiniM] imiiiiiiimiiiiimiiimii QM 

945 atcccaacttaatggtattcatggactacagagactatatgcaacagaaa 994 

• • * • * 

998 ttacagtgctcggaagaagaatatccaacatttctctatgtcatgcccat 1047 



995 



.acagtgctcggaagaagaatatccaacai 

.tgagg 

mil 



cta^catgcccat «m 



acaqtqctcqqaaqaaqaatat^ " f " j " ] ]' j | "f j | j | 

I II J .11 1 ...J I _ J 1 . ! J 1 11 1 1 J 1 1 ii jjj. jj. jaigicalgccca! 

* 

laaaqatq 

1045 gtcgccaacaagacttttttttgaggaaacctgtttggcctcaaaagalg 1094 

• • • * • 

1098 ccatgcctttcgatctactgaagagaaaactaatgtcacgattgaagact 1147 

mi iimiiiiimiimiiiii immiiimmii ,^ 

itqccattcqatctactgaagaqaaaactgatqtcacgattgaaqact 1 144 



1048 gtcgccaacaagacttttttttgaggaaacctgtttggcctcaaaagatg 1097 

1 III 1 M 1 1 IT] I lillJinTSTTi 1 1 1 filMiTTl III 1 1 1 ITf ) 

3tcqccaacaaqacttttttttqaqqaaacctqtttqqcct( 



1095 ccatgccattcgal 



1148 ctgggtatccaagttacaaaaatttatgaagaggaatggtcttatattcc 1197 

iimmmiimiim 111111 1 mrf iiini i imp „ M 

1145 ctgggtatccaagttacaaaagtttatgaagaggaatggtcatatattcc 1194 

* 

|?ii?f??linniiiiiiiiiiiiiriiiiiiiii m 

tqttqqtqqttctttaccaaacaca 



1198 tgttgggggttctttaccaaacacagagcaaaagaacctagcatttggtg 1247 

'.^;::niiiiiiiiiiimiiimiiiimiiiiTTiT „ AA 

1195 tgttggtggttctttaccaaacacagagcaaaagaacctagcatttggtg 1244 



1248 ctgcagcaagcatggtgcatccagcaacaggctattcggttgtacgatc 

; ill nil ill iinni i j 1 1 it iiimniiiimii i „„„ 

1245 ctgcagcaagcatggtgcatccagcaacaggctattcggttgtacggtca 1294 



jttgtacqatca 1297 



tatcaqaagctccaaaatatgcttctgtaattgcaaagattttgaagca 1! 

I I T \ IT 1 1 1 1 1 1 1 1 1 1 JT 1 JJ I IT) ! I JIT 1 1 1 1 1 (1 1 1JT M T 1 ! 

tgtcagaagctccaaaatatgcttctgtaattgcaaagattttgaagca 1. 

# • « • • 

qataactctgcatatgtggtttctggacaaagcagtgcagtaaacattt 1'. 

TMiuiiiTi iinimiii iiimn jiiiiitiiiiiiiii ,. 

gataactctgcgtatgtggtttctggacaaagtagtgcagtaaacattt L 



1398 caat ? caa ^ at ^ a ^ 1447 
1395 caalgcaagcatggagcagtctttggccaaaggagcgaaaacgtcaaaga 1444 

1448 ffiimiHiiffi z 

1445 gcattctttctttttggattagagcttattgtgcagctagatattgaagc 1494 
1498 aaccagaacgttctttagaaccttcttccgcttgccaacttggatgtggt 1547 

„ nc I HI 'I' II I J 1 111 I TI i 1 1 JJ 1 J J I IT 1J JTI 1 1 1 1 1 JTT 1 IT ITT 1 1M „ 

1495 aaccagaacattctttagaaccttcttccgcttgccaacttggatgtggt 1544 

• * * • ♦ 

1548 ggggtttccttgggtcttcactatcatctttcgatcttgtattgttttcc 1597 

ie « TO1II1PII11IIIIIIIIII111IIIII1 TJ 11? J 11 j i t 1BM 

1545 ggggtttccttgggtcttcactatcatctttcgatctcgtcttgttttcc 1594 



1598 atgtacatgtttgttttggccccgaacagcatgaggatgtcacttgtgag 1647 

' iimiiliiuilJiii ii mmiiMimiiiimM ie/M 

...j.--.*. -^ii xqaggatgtcacttgtgag 1644 



1595 a 



gtacatgtttgttttggcgccaaacagca 



1648 acatttgctttcagatccttctggtgcagttatggttaaagcttacctcg 1697 

WW iTi ii i J ji jt in i iinimi i mil 1 1 ii it 

itttgctttcagatccttctqgtgcaqttatqgtaaqaqcttacctcq 



1645 acai 



:cg 1694 



1698 aaaggtaatc. . .tgttt 



1695 aaaggiagtctcai 



ta tgaaactatagtgt ct cattaaa taaat ga 1 744 



i 1 1 Mil mil mm ni mil 

:tcatctattattaaactctagtgtttcaccaaataaa 

f 



ga 1744 



1745 ggatccttcgtatatgtatatgatcatctctatgtatatcctatattcta 1794 
FIG. 28A 

GAP of: Arabidopsis epsilon cyclase to Lettuce epsilon cyclase 

Gap Weight: 12 Average Match: 2.912 

Length Weight: 4 Average Mismatch: -2.003 

Quality: 1837 Length: 534 

Ratio: 3.499 Gaps: 3 

Percent Similarity: 76.381 Percent Identity: 69.905 
Match display thresholds for the alignment(s): 
| = IDENTITY : = 2 . = 1 



Arabidopsis x Lettuce 

1 MECVGARNF.AAMAVSTFPSW. . .SCRRKFPVVKRYSYRNIRFGLCSVRA 46 

III llll I III I I • -Ill : I- 

1 MECFGARNMTATMAVFTCPRFTDCNIRHKFSLLKQRRFTNLSASSSLRQI 50 

47 SGGGSSGSESCVAVREDFADEEDFVKAGGSEILFVQMQQNKDMDEQSKLV 96 

I MUhlMllil: IMII. I h llll 

51 KCSAKSDRCVVDKQGISVADEEDYVKAGGSELFFVQMQRTKSMESQSKLS 100 

97 DKLPPISIGDGALDHVVIGCGPAGLALAAESAKLGLKVGLIGPDLPFTNN 146 

:|| I II- II lllllllllllllllllllll HIIIIIIIIIII cn 
101 EKLAQIPIGNCILDLVVIGCGPAGLALAAESAKLGLNVGLIGPDLPFTNN 150 

147 YGVWEDEFNDLGLQKCIEHVWRETIVYLDDDKPITIGRAYGRVSRRLLHE 196 

M l : MI IM : I Ml 1 :: MI 1 1 1 1 1 1 1 1 1 1 1 1 1 I II 1 1 
151 YGVWQDEFIGLGLEGCIEHSWKDTLVYLDDADPIRIGRAYGRVHRDLLHE 200 

197 ELLRRCVESGVSYLSSKVDSITEASDGLRLVACDDNNAIPCRLATVASGA 246 

IIIIIIIIIIMIIIIII: llll -I I: h I llllllllllll cn 
201 ELLRRCVESGVSYLSSKVERITEAPNGYSLIECEGNITIPCRLATVASGA 250 

247 ASGKLLQYEVGGPRVCVQTAYGVEVEVENSPYDPDQMVFMDYRDYTNEKV 296 

INI hll.NIIMIIIIIhllllll.lllll llllllll:- I 
251 ASGKFLEYELGGPRVCVQTAYGIEVEVENNPYDPDLMVFMDYRDFSKHKP 300 

297 RSLEAEYPTFLY/^FWKSRLFFEETCU^KIDVMPFDLLKTKLMLRLDTL 346 

||||. Illlll I I- •::|lllllllh: lll-IIMII II 
301 ESLEAiaPTFLYVTWSPTKIFFEETClJ\SREAMPFNLLKSKL>1SRLKAM 350 
FIG. 28B 

347 GIRILKTYEEEWSYIPVGGSLPNTEQKNLAFGAAASMVHPATGYSVVRSL 396 
351 gIrITRTYEEBOT 400 

397 SEAPKYASVIAEI LREETTKQINS NISRQAWDTLWPPERKRQRAF 441 

401 S^FWAAV^ 450 

442 FLFGUUJVQFDTEGIRSFFRTFFRLPKWW 491 

Mill. II I II Ullllllllllll IMM-!-! I|::|lll 
451 FLFGL^HIVUCDLEGTRTFFRTFFRLPKWMWWGFLGSSLSSTDLIIFALY 500 

492 MFVISPNNLRKGLINHLISDPTGATNIKTYLKV* 525 

IIIU..II h Ihlllllllhl II si 

501 MFVIAPHSLRMELVRHLLSDPTGATMVKAYLTI* 534 
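
The summary values printed in the GAP headers can be related to one another with simple arithmetic. The Python sketch below is illustrative only and is not the GCG implementation. It assumes that Ratio is Quality divided by the length of the shorter sequence, and that Percent Identity is counted over aligned columns in which neither sequence has a gap; the function names `gap_ratio` and `percent_identity` are hypothetical. Under the first assumption, the FIG. 28A header values (Quality 1837, Arabidopsis numbering ending at 525) reproduce the reported Ratio of 3.499.

```python
def gap_ratio(quality, shorter_length):
    """Quality divided by the length of the shorter sequence (assumed definition)."""
    if shorter_length <= 0:
        raise ValueError("shorter_length must be positive")
    return quality / shorter_length


def percent_identity(row_a, row_b, gap_chars=".-"):
    """Percent of gap-free aligned columns whose symbols are identical.

    Assumption: columns where either row has a gap character are not counted.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same number of columns")
    compared = 0
    identical = 0
    for x, y in zip(row_a, row_b):
        if x in gap_chars or y in gap_chars:
            continue  # gap column: excluded from the denominator
        compared += 1
        if x.upper() == y.upper():
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# FIG. 28A header: Quality 1837; the Arabidopsis row numbering ends at 525.
print(round(gap_ratio(1837, 525), 3))        # 3.499
# Toy aligned pair (not taken from the figures): 4 of 5 gap-free columns match.
print(percent_identity("ACGT-A", "ACCTTA"))  # 80.0
```

Percent Similarity would additionally require the scoring table used by the program, so it is not sketched here.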
Docket No. 108172-00022 ARENT FOX KINTNER PLOTKIN & KAHN, PLLC 

Nikaido, Marmelstein, Murray & Oram Intellectual Property Group 

Declaration For U.S. Patent Application 

As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below my name. 

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural 
names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled 
(Insert Title) GENES OF CAROTENOID BIOSYNTHESIS AND METABOLISM AND METHODS OF USE THEREOF 

the specification of which is attached hereto unless the following box is checked: 

☒ was filed on June 2, 1999 as PCT International Application 

Number PCT/US99/12121 and was amended on 



and/or was filed on December 4, 2000 as U.S. Patent Application 

Number and was amended on 



I hereby state that I have reviewed and understand the contents of the above-identified specification, including the claim(s), as amended 
by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to patentability as defined in 37 C.F.R. §1.56. 
I hereby claim foreign priority benefits under 35 U.S.C. §119(a)-(d) or §365(b) of any foreign application(s) for patent or inventor's 
certificate, or §365(a) of any PCT International application which designated at least one country other than the United States, listed 
below and have also identified below any foreign application for patent or inventor's certificate or PCT International Application having 
a filing date before that of the application(s) for which priority is claimed: 

Priority Claimed 



(List prior 

foreign 

applications) 


(Number) 


(Country) 


(Day/Month/Year Filed) 


□ Yes 


□ No 


(Number) 


(Country) 


(Day/Month/Year Filed) 


□ Yes 


□ No 




(Number) 


(Country) 


(Day/Month/Year Filed) 







I hereby claim the benefit under 35 U.S.C. §119(e) of any United States provisional application(s) listed below. 



(Application Number) (Filing Date) 



(Application Number) (Filing Date) 

□ See attached list for additional prior foreign or provisional applications. 



I hereby claim the benefit under 35 U.S.C. §120 of any United States application(s) or §365(c) of any PCT International application(s) 
designating the United States of America listed below and, insofar as the subject matter of each of the claims of this application is not 
disclosed in the prior application(s) (U.S. or PCT) in the manner provided by the first paragraph of 35 U.S.C. §112, I acknowledge the 
duty to disclose information which is material to patentability as defined in 37 C.F.R. §1.56 which became available between the filing 
date of the prior application and the national or PCT International filing date of this application. 

(List prior U.S. applications or PCT International applications designating the U.S.) 

09/088,724 June 2, 1998 
(Application Serial No.) (Filing Date) (Status) (patented, pending, abandoned) 

09/088,725 June 2, 1998 
(Application Serial No.) (Filing Date) (Status) (patented, pending, abandoned) 

And I hereby appoint the firm of Arent Fox, Customer Number 004372 including as principal attorneys: Robert B. Murray, Reg. No. 
22,980; Charles M. Marmelstein, Reg. No. 25,895; George E. Oram, Jr., Reg. No. 27,931; Douglas H. Goldhush, Reg. No. 33,125; 
David T. Nikaido, Reg. No. 22,663; Richard J. Berman, Reg. No. 39,107; King L. Wong, Reg. No. 37,500; James A. Poulos, III, 
Reg. No. 31,714; Murat Ozgu, Reg. No. 44,275; Robert K. Carpenter, Reg. No. 34,794; Gregory B. Kang, Reg. No. 45,273; Rustan 
Hill, Reg. No. 37,351; Carl Schaukowitch, Reg. No. 29,211; Kevin Turner, Reg. No. 43,437; Rhonda C. Barton, Reg. No. P47,271 
and Hans J. Crosby, Reg. No. 44,634. 

Please direct all communications to the following address: ARENT FOX KINTNER PLOTKIN & KAHN, PLLC 

1050 Connecticut Avenue, N.W., Suite 600 
Washington, D.C. 20036-5339 

Telephone No. (202) 857-6000; Facsimile No. (202) 638-4810 



The undersigned hereby authorizes the U.S. attorneys named herein to accept and follow instructions from the undersigned's assignee, if 
any, and/or, if the undersigned is not a resident of the United States, the undersigned's domestic attorney, patent attorney or patent 
agent, as to any action to be taken in the Patent and Trademark Office regarding this application without direct communication between 
the U.S. attorneys and the undersigned. In the event of a change in the person(s) from whom instructions may be taken, the U.S. 
attorneys named herein will be so notified by the undersigned. 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief 
are believed to be true; and further, that these statements were made with the knowledge that willful false statements and the like so 
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful 
false statements may jeopardize the validity of the application or any patent issued thereon. 



Full name of sole or first inventor Francis X. CUNNINGHAM, Jr. 
Inventor's signature 



Date 

Residence 2727 Washington Avenue, Chevy Chase, MD 20815 

Citizenship American 

Post Office Address Same as the above 

Full name of sole or second inventor Zairen SUN 

Inventor's signature 

Residence 3405 Tulane Drive, #22, Hyattsville, MD [remainder illegible] 
Citizenship [illegible] 

Post Office Address Same as the above 

Full name of sole or third inventor 

Inventor's signature 



Date 

Residence 

Citizenship 



Post Office Address 